R86s U2 SFP+ interfaces not showing up (2024)


sketch

Nov 1, 2023

  • #1

Hi, I recently got an R86s U2 but I'm having trouble getting the SFP+ interfaces to work.

I can see the three 2.5Gbps interfaces but not the two SFP+ ones.

Code:

$ ip l | grep -P '^\d'
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
2: enp1s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
3: enp2s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
4: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
$ lspci| grep -i eth
01:00.0 Ethernet controller: Intel Corporation Device 125c (rev 04)
02:00.0 Ethernet controller: Intel Corporation Device 125c (rev 04)
03:00.0 Ethernet controller: Intel Corporation Device 125c (rev 04)
05:00.0 Ethernet controller: Mellanox Technologies MT27500 Family [ConnectX-3]

If I check dmesg, I see some errors from mlx4_core.

Code:

$ lsmod | grep mlx
mlx4_core 405504 0
$ sudo dmesg | grep -i mlx
[ 1.397624] mlx4_core: Mellanox ConnectX core driver v4.0-0
[ 1.397657] mlx4_core: Initializing 0000:05:00.0
[ 67.788745] mlx4_core 0000:05:00.0: command 0x34 timed out (go bit not cleared)
[ 67.788758] mlx4_core 0000:05:00.0: device is going to be reset
[ 67.788761] mlx4_core 0000:05:00.0: crdump: FW doesn't support health buffer access, skipping
[ 68.805242] mlx4_core 0000:05:00.0: device was reset successfully
[ 68.805246] mlx4_core 0000:05:00.0: Failed to override log_pg_sz parameter
[ 68.805248] mlx4_core 0000:05:00.0: Failed to init fw, aborting.
[ 69.829020] mlx4_core: probe of 0000:05:00.0 failed with error -5
$ sudo mstconfig q
-E- Failed to open device: /sys/bus/pci/devices/0000:05:00.0/config. Cannot perform operation, Driver might be down.

Has anyone run into this before? I'm hoping I'm missing something and didn't get a dud.

I'm running Ubuntu 22.04, although I also booted into the OpenWrt install it shipped with and didn't see the interfaces there either.

Code:

$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.3 LTS"
$ uname -a
Linux r86su2 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

sketch

Nov 1, 2023

  • #2

I thought maybe the firmware needed to be updated, but as best I can tell 2.42.5000 should be fine.

Code:

$ sudo mstflint -d 5:00.0 q
Image type: FS2
FW Version: 2.42.5000
FW Release Date: 5.9.2017
Product Version: 02.42.50.00
Rom Info: type=UEFI version=14.11.45 cpu=AMD64
          type=PXE version=3.4.752
Device ID: 4099
Description: Node Port1 Port2 Sys image
GUIDs: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
MACs: f45214cd2d06 f45214cd2d07
VSD:
PSID: MT_1680110023

sko

Nov 2, 2023

  • #3

Usually Mellanox ConnectX cards support both Ethernet and InfiniBand, so you'd need to load the appropriate driver to switch the card into the corresponding mode. The 'core' driver on its own doesn't register any device in the OS.

No idea how Linux handles this. On the BSDs, to get it running in Ethernet mode, you'd simply load the mlx[4/5]en (FreeBSD) or mcx (OpenBSD) driver via rc.conf and that's it...

sketch

Nov 2, 2023

  • #4

Thanks for the suggestion, sko.

I booted OPNsense, since I found several references online to people using it successfully with the R86S, but there was no difference there. I tried loading mlx4_en with `kldload mlx4en` but no cigar.

I enabled debugging in mlx4-core and tried loading mlx4-en in Ubuntu as well, but still no luck.

Code:

$ cat /etc/modprobe.d/mlx4.conf
options mlx4_core debug_level=1
$ sudo modprobe -vv mlx4-en
modprobe: INFO: ../libkmod/libkmod.c:367 kmod_set_log_fn() custom logging function 0x55c0e0783830 registered
insmod /lib/modules/5.15.0-88-generic/kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko debug_level=1
insmod /lib/modules/5.15.0-88-generic/kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_en.ko
modprobe: INFO: ../libkmod/libkmod.c:334 kmod_unref() context 0x55c0e1742440 released
$ lsmod | grep mlx
mlx4_en 155648 0
mlx4_core 405504 1 mlx4_en

The debug logging got me some additional error messages in dmesg.

Code:

$ sudo dmesg | grep mlx
[ 542.242739] mlx4_core: Mellanox ConnectX core driver v4.0-0
[ 542.242773] mlx4_core: Initializing 0000:05:00.0
[ 543.270559] mlx4_core 0000:05:00.0: FW version 2.42.5000 (cmd intf rev 3), max commands 16
[ 543.270563] mlx4_core 0000:05:00.0: Catastrophic error buffer at 0x1f020, size 0x10, BAR 0
[ 543.270566] mlx4_core 0000:05:00.0: Communication vector bar:2 offset:0x800
[ 543.270567] mlx4_core 0000:05:00.0: FW size 385 KB
[ 543.270569] mlx4_core 0000:05:00.0: Internal clock bar:0 offset:0x78f50
[ 543.270570] mlx4_core 0000:05:00.0: Clear int @ f0058, BAR 0
[ 543.274135] mlx4_core 0000:05:00.0: Mapped 26 chunks/6168 KB for FW
[ 608.653452] mlx4_core 0000:05:00.0: command 0x34 timed out (go bit not cleared)
[ 608.653458] mlx4_core 0000:05:00.0: device is going to be reset
[ 608.653484] mlx4_core 0000:05:00.0: crdump: FW doesn't support health buffer access, skipping
[ 609.669943] mlx4_core 0000:05:00.0: device was reset successfully
[ 609.669968] mlx4_core 0000:05:00.0: Failed to override log_pg_sz parameter
[ 609.669970] mlx4_core 0000:05:00.0: Failed to init fw, aborting.
[ 610.693692] mlx4_core: probe of 0000:05:00.0 failed with error -5
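
For anyone comparing their own unit against this log, the failure signature can be picked out mechanically. The following is only a rough sketch: `mlx4_probe_summary` is a hypothetical helper name, and the patterns are taken from the messages quoted in this thread, so other kernel versions may word them differently. Error -5 is `EIO` on Linux.

```shell
#!/bin/sh
# Sketch: summarize the mlx4_core failure signature from kernel log text on stdin.
# mlx4_probe_summary is a hypothetical helper, not part of any Mellanox tool.
mlx4_probe_summary() {
    log=$(cat)
    # Firmware command never completed (the "go bit" stayed set).
    if printf '%s\n' "$log" | grep -q 'mlx4_core.*timed out (go bit not cleared)'; then
        echo "firmware command timeout"
    fi
    # Driver gave up on the device; -5 is EIO.
    if printf '%s\n' "$log" | grep -q 'mlx4_core: probe of .* failed with error -5'; then
        echo "probe failed with -5 (EIO)"
    fi
    # Nothing from the driver at all: module not loaded or device not enumerated.
    if ! printf '%s\n' "$log" | grep -q 'mlx4_core'; then
        echo "no mlx4_core messages"
    fi
}
```

On the affected machine it would be used as `sudo dmesg | mlx4_probe_summary`.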

Looking at the output of `lspci` it still shows the driver as `mlx4_core`. I suspect mlx4-en depends on mlx4-core to successfully initialize the device but I don't know.

Code:

05:00.0 Ethernet controller: Mellanox Technologies MT27500 Family [ConnectX-3]
        Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3]
        Flags: fast devsel, IRQ 29
        Memory at 7fd00000 (64-bit, non-prefetchable) [size=1M]
        Memory at 6000000000 (64-bit, prefetchable) [size=8M]
        Expansion ROM at 7fc00000 [disabled] [size=1M]
        Capabilities: [40] Power Management version 3
        Capabilities: [48] Vital Product Data
        Capabilities: [9c] MSI-X: Enable- Count=128 Masked-
        Capabilities: [60] Express Endpoint, MSI 00
        Capabilities: [c0] Vendor Specific Information: Len=18 <?>
        Capabilities: [100] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [148] Device Serial Number ███████████████████████
        Capabilities: [154] Advanced Error Reporting
        Capabilities: [18c] Secondary PCI Express
        Kernel modules: mlx4_core
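
A note on reading this output: in `lspci -k` / `lspci -v`, the `Kernel modules:` line lists modules that could handle the device, while a separate `Kernel driver in use:` line appears only when a driver is actually bound. The listing above has only the former, which is consistent with the failed probe. A small sketch for checking this (the function name `bound_driver` is made up for illustration):

```shell
#!/bin/sh
# Sketch: print the driver bound to a device, given `lspci -k` or `lspci -v`
# text on stdin. Prints "none" when only "Kernel modules:" is present.
bound_driver() {
    drv=$(sed -n 's/^[[:space:]]*Kernel driver in use: //p' | head -n 1)
    if [ -n "$drv" ]; then
        echo "$drv"
    else
        echo "none"
    fi
}
```

For example: `lspci -k -s 05:00.0 | bound_driver`.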

I also tried setting the port type to ethernet in mlx4.conf and reloading mlx4-en but I'm still seeing the same errors in dmesg.

Code:

$ cat /etc/modprobe.d/mlx4.conf
options mlx4_core debug_level=1 port_type_array=2,2
$ sudo modprobe -vv mlx4-en
modprobe: INFO: ../libkmod/libkmod.c:367 kmod_set_log_fn() custom logging function 0x56440c724830 registered
insmod /lib/modules/5.15.0-88-generic/kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko debug_level=1 port_type_array=2,2
insmod /lib/modules/5.15.0-88-generic/kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_en.ko
modprobe: INFO: ../libkmod/libkmod.c:334 kmod_unref() context 0x56440c97b440 released
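
On a working ConnectX-3 under Linux, the port type can also be read back through the `mlx4_port1` / `mlx4_port2` sysfs attributes in the PCI device directory. As far as I know these attributes are created by mlx4_core only after a successful probe, so on a card in the state shown here they would be expected to be missing. A sketch under that assumption (`show_port_types` is a hypothetical helper; the device path in the usage line is the one from this thread):

```shell
#!/bin/sh
# Sketch: print the mlx4 port type attributes under a PCI device's sysfs
# directory, passed as $1. Assumes the attribute names mlx4_port1/mlx4_port2.
show_port_types() {
    dev_dir=$1
    found=0
    for n in 1 2; do
        f="$dev_dir/mlx4_port$n"
        if [ -r "$f" ]; then
            echo "port$n: $(cat "$f")"
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no mlx4_port attributes (driver probably not bound)"
    fi
}
```

For example: `show_port_types /sys/bus/pci/devices/0000:05:00.0`.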

sko

Nov 2, 2023

  • #5

try setting the "sys.device.mlx4_core0.mlx4_port1" sysctl to "eth". I just checked some FreeBSD hosts with ConnectX3 NICs and I've also set that sysctl on those.

sketch

Nov 2, 2023

  • #6

sko said:

try setting the "sys.device.mlx4_core0.mlx4_port1" sysctl to "eth". I just checked some FreeBSD hosts with ConnectX3 NICs and I've also set that sysctl on those.

No luck.

I loaded mlx4en with kldload mlx4en and then tried sysctl sys.device.mlx4_core0.mlx4_port1=eth but I get an unknown oid error.

MrTeeJay

Nov 2, 2023

  • #7

Curious question, have you installed any SFP modules? I don't think anything shows up if you haven't anything installed in the SFP ports...

sketch

Nov 2, 2023

  • #8

MrTeeJay said:

Curious question, have you installed any SFP modules? I don't think anything shows up if you haven't anything installed in the SFP ports...

Yes I’ve put an Ethernet transceiver in one port and a fiber one in the other. Both transceivers work in CX3 cards I have in other machines.

Come to think of it those other CX3s work out of the box with Ubuntu. I can see mlx4_core and mlx4_en loaded on them in dmesg without errors as well.

I’m starting to wonder if there’s something wrong with the R86S, although I’m not sure what my options will be if that’s the case since I ordered it from AliExpress.

sko

Nov 3, 2023

  • #9

sketch said:

No luck.

I loaded mlx4en with kldload mlx4en and then tried sysctl sys.device.mlx4_core0.mlx4_port1=eth but I get an unknown oid error.

what's the output of 'sysctl sys | grep mlx' and 'sysctl dev | grep mlx'?

Also I *strongly* suggest not using OPNsense or pfSense but vanilla FreeBSD. Especially the latter is built on the development branch, which was NEVER intended for production use (and *will* break from time to time, like during the latest change to OpenSSL3 in base), and they have also often used beta drivers (with known results...)
Also the crappy middleware will constantly overwrite/change settings you made the "right way" and make easy things hard or impossible.
There's a good chance the NICs won't work because of one/some of those reasons or because they don't include some drivers.

I'm running 6 FreeBSD hosts (12.4-RELEASE & 13.2-RELEASE) with ConnectX3&4 NICs and those NICs work fine there...

sketch

Nov 3, 2023

  • #10

sko said:

what's the output of 'sysctl sys | grep mlx' and 'sysctl dev | grep mlx'?

Also I *strongly* suggest not using OPNsense or pfSense but vanilla FreeBSD. Especially the latter is built on the development branch, which was NEVER intended for production use (and *will* break from time to time, like during the latest change to OpenSSL3 in base), and they have also often used beta drivers (with known results...)
Also the crappy middleware will constantly overwrite/change settings you made the "right way" and make easy things hard or impossible.
There's a good chance the NICs won't work because of one/some of those reasons or because they don't include some drivers.

I'm running 6 FreeBSD hosts (12.4-RELEASE & 13.2-RELEASE) with ConnectX3&4 NICs and those NICs work fine there...

I tried again with FreeBSD 12.4 but I'm getting the same results.

Here is the output of those sysctl commands:

Code:

root@:~ # sysctl sys | grep mlx
root@:~ # sysctl dev | grep mlx
dev.mlx4_core.%parent:

The dmesg errors are the same: `mlx4_core0: Failed to override log_pg_sz parameter` and `device_attach: mlx4_core0 attach returned 5`.

Code:

root@:~ # ifconfig | grep '^\w'
igc0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
igc1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
igc2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
root@:~ # dmesg | grep -i mlx
mlx5en: Mellanox Ethernet driver 3.6.0 (December 2020)
root@:~ # kldload -v mlx4en
Loaded mlx4en, id=7
root@:~ # dmesg | grep -i mlx
mlx5en: Mellanox Ethernet driver 3.6.0 (December 2020)
mlx4_core0: <mlx4_core> mem 0x7fd00000-0x7fdfffff,0x6000000000-0x60007fffff at device 0.0 on pci5
mlx4_core: Mellanox ConnectX core driver v3.6.0 (December 2020)
mlx4_core: Initializing mlx4_core
mlx4_core0: command 0x34 timed out (go bit not cleared)
mlx4_core0: device is going to be reset
mlx4_core0: device was reset successfully
mlx4_core0: Failed to override log_pg_sz parameter
mlx4_core0: Failed to init fw, aborting.
device_attach: mlx4_core0 attach returned 5
root@:~ # freebsd-version
12.4-RELEASE

sko

Nov 3, 2023

  • #11

sketch said:

mlx4_core0: command 0x34 timed out (go bit not cleared)

According to a bunch of search results, this error points to MSI being deactivated...

Can you give me the output of 'sysctl -a | grep msi'?
MSI(-X) is not in some legacy-mode or disabled in BIOS?

And just to be clear: this is all bare metal? Not on some hypervisor and SR-IOV magic?

sketch

Nov 3, 2023

  • #12

sko said:

According to a bunch of search results, this error points to MSI being deactivated...

Can you give me the output of 'sysctl -a | grep msi'?


Code:

root@:~ # sysctl -a | grep msi
hw.ice.rdma_max_msix: 64
hw.sdhci.enable_msi: 1
hw.puc.msi_disable: 0
hw.pci.honor_msi_blacklist: 1
hw.pci.msix_rewrite_table: 0
hw.pci.enable_msix: 1
hw.pci.enable_msi: 1
hw.mfi.msi: 1
hw.malo.pci.msi_disable: 0
hw.ix.enable_msix: 1
hw.bce.msi_enable: 1
hw.aac.enable_msi: 1
machdep.disable_msix_migration: 0
machdep.num_msi_irqs: 2048
dev.igc.2.iflib.disable_msix: 0
dev.igc.1.iflib.disable_msix: 0
dev.igc.0.iflib.disable_msix: 0
compat.linuxkpi.mlx4_msi_x: 1

sko said:

MSI(-X) is not in some legacy-mode or disabled in BIOS?

Not that I can tell. I didn't see anything named MSI or MSI-X in the BIOS.
Could it be under some other name?

sko said:

And just to be clear: this is all bare metal? Not on some hypervisor and SR-IOV magic?

Yeah, bare metal, and SR-IOV is disabled in the BIOS too.

sketch

Nov 3, 2023

  • #13

I tried looking at the mlx4 source but I’m not sure what to make of this.

Code:

mlx4_cfg.log_pg_sz_m = 1;
mlx4_cfg.log_pg_sz = 0;
err = mlx4_MOD_STAT_CFG(dev, &mlx4_cfg);
if (err)
        mlx4_warn(dev, "Failed to override log_pg_sz parameter\n");

Code:

/* Attempt to access reserved or unallocaterd resource: */
CMD_STAT_BAD_RESOURCE = 0x05,

sketch

Nov 3, 2023

  • #14

I compared the lspci output on the R86S to the output on another machine I have with a working CX3 card.

The r86s shows:

Code:

Capabilities: [9c] MSI-X: Enable- Count=128 Masked-

On the working machines it's `MSI-X: Enable+`, which seems to point at MSI like sko suggested.

I've been looking through the BIOS to see if there's a setting but I haven't found anything yet.
I also tried resetting the BIOS to the defaults but still nothing.
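
To make the comparison between machines repeatable, the MSI-X flag can be extracted from `lspci -vv` output with a short helper. This is a sketch only; `msix_state` is a made-up name. One point to keep in mind when interpreting it: the Enable bit is normally set by the driver when it enables MSI-X, so `Enable-` on a device whose driver failed to initialize may be a symptom of the failed probe rather than its cause.

```shell
#!/bin/sh
# Sketch: report the MSI-X enable flag from `lspci -vv` text on stdin.
# Looks for the first "MSI-X: Enable+" or "MSI-X: Enable-" occurrence.
msix_state() {
    flag=$(sed -n 's/.*MSI-X: Enable\([+-]\).*/\1/p' | head -n 1)
    case "$flag" in
        +) echo "enabled" ;;
        -) echo "disabled" ;;
        *) echo "no MSI-X capability found" ;;
    esac
}
```

For example: `sudo lspci -vv -s 05:00.0 | msix_state`, run on both the R86S and a known-good machine.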

jmilleriec

Feb 24, 2024

  • #15

Did you ever solve this problem? I'm having the same problem, the SFP interfaces are not showing up under opnsense or vyos...

cloudhax

Apr 9, 2024

  • #16

yikes, I have the same problem here

cloudhax

Apr 11, 2024

  • #17

I've found that my problem seems intermittent. I semi-reliably see the SFP+ ports on the first boot after plugging in the USB-C power input. On subsequent reboots the ports are just missing, as if they don't power up. I'll try another power adapter or just hope it never needs to reboot.
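
To pin down an intermittent pattern like this, it may help to record on every boot whether the SFP+ interfaces came up. A minimal sketch for Linux, assuming the ConnectX-3 network interfaces point at a PCI device bound to `mlx4_core` in sysfs; `count_mlx4_ifaces` and the log path in the usage line are hypothetical:

```shell
#!/bin/sh
# Sketch: count network interfaces whose underlying device is bound to
# mlx4_core. The sysfs net directory can be overridden via $1.
count_mlx4_ifaces() {
    net_dir=${1:-/sys/class/net}
    count=0
    for iface in "$net_dir"/*; do
        link="$iface/device/driver"
        # Virtual interfaces such as lo have no device/driver link.
        [ -L "$link" ] || continue
        case "$(basename "$(readlink "$link")")" in
            mlx4_core) count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}
```

Run from a boot-time script, for example `echo "$(date) mlx4 interfaces: $(count_mlx4_ifaces)" >> /var/log/sfp-check.log`, this would give a per-boot record to compare against cold starts versus warm reboots.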

