iPXE on Synology NAS ds1515+ - NICs link down. #1188

Open
hetii opened this issue Apr 8, 2024 · 6 comments

@hetii

hetii commented Apr 8, 2024

Hi.
While testing iPXE on a Synology NAS DS1515+, I noticed a problem with the NICs.

I build iPXE with:
make bin-x86_64-efi/ipxe.efi DEBUG=intel,intelx,netdevice EMBED=SynoBootLoader.ipxe

The first issue concerns the network cards.
All of my interfaces report the link as down (all LEDs on the RJ45 sockets are off):

 iPXE initialising devices...
INTEL 0x79a05a58 MAC+PHY reset (08100241/80382780 was 08100241/80382780)
INTEL 0x79a05a58 has autoloaded MAC address 00:a0:c9:00:00:00
NETDEV net0 registered (phys 0000:00:14.0 hwaddr 00:a0:c9:00:00:00)
INTEL 0x79a05a58 link status is 80382780
NETDEV net0 link is down: Down (https://ipxe.org/38086193)
INTEL 0x79a067b8 MAC+PHY reset (08100241/80382784 was 08100241/80382784)
INTEL 0x79a067b8 has autoloaded MAC address 00:a0:c9:00:00:01
NETDEV net1 registered (phys 0000:00:14.1 hwaddr 00:a0:c9:00:00:01)
INTEL 0x79a067b8 link status is 80382784
NETDEV net1 link is down: Down (https://ipxe.org/38086193)
INTEL 0x79a08598 MAC+PHY reset (08100241/80382788 was 08100241/80382788)
INTEL 0x79a08598 has autoloaded MAC address 00:a0:c9:00:00:02
NETDEV net2 registered (phys 0000:00:14.2 hwaddr 00:a0:c9:00:00:02)
INTEL 0x79a08598 link status is 80382788
NETDEV net2 link is down: Down (https://ipxe.org/38086193)
INTEL 0x79a0a418 MAC+PHY reset (08100241/8038278c was 08100241/8038278c)
INTEL 0x79a0a418 has autoloaded MAC address 00:a0:c9:00:00:03
NETDEV net3 registered (phys 0000:00:14.3 hwaddr 00:a0:c9:00:00:03)
INTEL 0x79a0a418 link status is 8038278c
NETDEV net3 link is down: Down (https://ipxe.org/38086193)
ok
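
The `link status is ...` values above can be decoded by hand. What follows is a minimal Python sketch, assuming the printed value is the Intel STATUS register and that its low bits follow the usual e1000/igb layout (bit 0 full duplex, bit 1 link up, bits 3:2 PCI function, bits 7:6 speed). The names `decode_status` and `SPEEDS` are made up for illustration and are not part of iPXE.

```python
# Assumed STATUS register layout (e1000/igb family); not taken from this thread.
STATUS_FD = 0x00000001          # full duplex
STATUS_LU = 0x00000002          # link up
STATUS_FUNC_MASK = 0x0000000C   # PCI function / LAN ID, bits 3:2
STATUS_FUNC_SHIFT = 2
STATUS_SPEED_MASK = 0x000000C0  # speed indication, bits 7:6
STATUS_SPEED_SHIFT = 6
SPEEDS = {0: 10, 1: 100, 2: 1000, 3: 1000}  # Mbps, per the assumed encoding


def decode_status(value):
    """Split a raw STATUS value into the fields of interest."""
    return {
        "link_up": bool(value & STATUS_LU),
        "full_duplex": bool(value & STATUS_FD),
        "function": (value & STATUS_FUNC_MASK) >> STATUS_FUNC_SHIFT,
        "speed_mbps": SPEEDS[(value & STATUS_SPEED_MASK) >> STATUS_SPEED_SHIFT],
    }


# Values copied from the debug output for net0..net3.
for name, raw in [("net0", 0x80382780), ("net1", 0x80382784),
                  ("net2", 0x80382788), ("net3", 0x8038278C)]:
    print(name, decode_status(raw))
```

Under these assumptions, all four values have the link-up bit clear and differ only in the function field (0 to 3), which matches the four PCI functions 0000:00:14.0 to 0000:00:14.3. The speed field is probably not meaningful while the link is down.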

Opening the interfaces doesn't change anything:

iPXE> ifopen net0
NETDEV net0 opening
INTEL 0x79a05a58 ring 03800 is at [7b749000,7b749100)
INTEL 0x79a05a58 ring 02800 is at [7b748000,7b748100)
INTEL 0x79a05a58 link status is 80382780
iPXE> INTEL 0x79a05a58 link status is 80382780

iPXE> ifopen net1
NETDEV net1 opening
INTEL 0x79a067b8 ring 03800 is at [7b747000,7b747100)
INTEL 0x79a067b8 ring 02800 is at [7b746000,7b746100)
INTEL 0x79a067b8 link status is 80382784
iPXE> INTEL 0x79a067b8 link status is 80382784
ifopen net2
NETDEV net2 opening
INTEL 0x79a08598 ring 03800 is at [7b745000,7b745100)
INTEL 0x79a08598 ring 02800 is at [7b744000,7b744100)
INTEL 0x79a08598 link status is 80382788
iPXE> INTEL 0x79a08598 link status is 80382788
ifopen net3
NETDEV net3 opening
INTEL 0x79a0a418 ring 03800 is at [7b743000,7b743100)
INTEL 0x79a0a418 ring 02800 is at [7b742000,7b742100)
INTEL 0x79a0a418 link status is 8038278c
iPXE> INTEL 0x79a0a418 link status is 8038278c
ifstat
net0: 00:a0:c9:00:00:00 using i354 on 0000:00:14.0 (Ethernet) [open]
  [Link:down, TX:0 TXE:0 RX:0 RXE:0]
  [Link status: Down (https://ipxe.org/38086193)]
net1: 00:a0:c9:00:00:01 using i354 on 0000:00:14.1 (Ethernet) [open]
  [Link:down, TX:0 TXE:0 RX:0 RXE:0]
  [Link status: Down (https://ipxe.org/38086193)]
net2: 00:a0:c9:00:00:02 using i354 on 0000:00:14.2 (Ethernet) [open]
  [Link:down, TX:0 TXE:0 RX:0 RXE:0]
  [Link status: Down (https://ipxe.org/38086193)]
net3: 00:a0:c9:00:00:03 using i354 on 0000:00:14.3 (Ethernet) [open]
  [Link:down, TX:0 TXE:0 RX:0 RXE:0]
  [Link status: Down (https://ipxe.org/38086193)]

We can see that the MAC addresses are just OUI-based defaults, but even when I set the correct ones nothing changes.
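
To make the MAC observation concrete: the autoloaded addresses consist of the prefix 00:a0:c9 (which appears to be an Intel-registered OUI) followed by 00:00:00 to 00:00:03. The sketch below splits an address into its OUI and device-specific halves and flags addresses whose device-specific half is only a small counter. The helper names and the threshold are hypothetical, and the heuristic is an assumption, not something iPXE does.

```python
def split_mac(mac):
    """Return (oui, nic_specific) as two 3-byte values."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError(f"not a 6-octet MAC address: {mac!r}")
    return bytes(octets[:3]), bytes(octets[3:])


def looks_like_placeholder(mac, max_index=0xFF):
    """Heuristic (assumption): a device-specific part that is only a tiny
    counter suggests a default address rather than a factory-programmed one."""
    _oui, nic_specific = split_mac(mac)
    return int.from_bytes(nic_specific, "big") <= max_index


# Addresses copied from the log above.
for mac in ["00:a0:c9:00:00:00", "00:a0:c9:00:00:01",
            "00:a0:c9:00:00:02", "00:a0:c9:00:00:03"]:
    oui, nic = split_mac(mac)
    print(mac, oui.hex(":"), nic.hex(":"), looks_like_placeholder(mac))
```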

On the other hand, I compiled a plain 6.8.4 kernel and loaded the igb driver, and all my NICs work fine,
so this seems to be an issue in the intel driver implementation inside iPXE:

/lib/modules # insmod dca.ko
[   78.060729] dca: module verification failed: signature and/or required key missing - tainting kernel
[   78.071692] dca service started, version 1.12.1
/lib/modules # insmod  i2c-algo-bit.ko
/lib/modules # insmod igb.ko
[   92.618261] igb: Intel(R) Gigabit Ethernet Network Driver
[   92.624310] igb: Copyright (c) 2007-2014 Intel Corporation.
[   92.986075] igb 0000:00:14.0: added PHC on eth0
[   92.991166] igb 0000:00:14.0: Intel(R) Gigabit Ethernet Network Connection
[   92.998927] igb 0000:00:14.0: eth0: PBA No: 002100-000
[   93.004675] igb 0000:00:14.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)
[   93.370011] igb 0000:00:14.1: added PHC on eth1
[   93.375091] igb 0000:00:14.1: Intel(R) Gigabit Ethernet Network Connection
[   93.382852] igb 0000:00:14.1: eth1: PBA No: 002100-000
[   93.388599] igb 0000:00:14.1: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)
[   93.754345] igb 0000:00:14.2: added PHC on eth2
[   93.759448] igb 0000:00:14.2: Intel(R) Gigabit Ethernet Network Connection
[   93.767223] igb 0000:00:14.2: eth2: PBA No: 002100-000
[   93.772983] igb 0000:00:14.2: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)
[   94.138483] igb 0000:00:14.3: added PHC on eth3
[   94.143581] igb 0000:00:14.3: Intel(R) Gigabit Ethernet Network Connection
[   94.151350] igb 0000:00:14.3: eth3: PBA No: 002100-000
[   94.157110] igb 0000:00:14.3: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)
/lib/modules # ip link set dev eth0 up
/lib/modules # [  130.605793] igb 0000:00:14.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

Now the LEDs on eth0 are on.

I can compare the kernel driver against the iPXE driver, but if you have any tips on what I should check first on the iPXE side, let me know.

B.R.

@mcb30
Member

mcb30 commented Apr 8, 2024

Try connecting a pair of ports on the NIC directly with a cable (i.e. without any network switch in between) and use the lotest command (enabled via #define LOTEST_CMD in config/general.h) to send packets from one port to another. This should eliminate any possible issues related to your specific network setup, as a starting point.

@hetii
Author

hetii commented Apr 8, 2024

@mcb30 Hi.

I tried different cable configurations and NICs without success.

I used a normal cable, not a crossover one, if that matters.

Btw, as I wrote before, everything works fine under Linux, so this should not be related to my switches or cables.


 Type 'exit' to get the back to the menu
iPXE> ifstat
net0: 00:a0:c9:00:00:00 using i354 on 0000:00:14.0 (Ethernet) [closed]
  [Link:down, TX:0 TXE:0 RX:0 RXE:0]
  [Link status: Down (https://ipxe.org/38086193)]
net1: 00:a0:c9:00:00:01 using i354 on 0000:00:14.1 (Ethernet) [closed]
  [Link:down, TX:0 TXE:0 RX:0 RXE:0]
  [Link status: Down (https://ipxe.org/38086193)]
net2: 00:a0:c9:00:00:02 using i354 on 0000:00:14.2 (Ethernet) [closed]
  [Link:down, TX:0 TXE:0 RX:0 RXE:0]
  [Link status: Down (https://ipxe.org/38086193)]
net3: 00:a0:c9:00:00:03 using i354 on 0000:00:14.3 (Ethernet) [closed]
  [Link:down, TX:0 TXE:0 RX:0 RXE:0]
  [Link status: Down (https://ipxe.org/38086193)]
iPXE> ifopen net0
NETDEV net0 opening
INTEL 0x79a05a58 ring 03800 is at [7b748000,7b748100)
INTEL 0x79a05a58 ring 02800 is at [7b747000,7b747100)
INTEL 0x79a05a58 link status is 80382780
iPXE> INTEL 0x79a05a58 link status is 80382780
ifopen net1
NETDEV net1 opening
INTEL 0x79a067b8 ring 03800 is at [7b746000,7b746100)
INTEL 0x79a067b8 ring 02800 is at [7b745000,7b745100)
INTEL 0x79a067b8 link status is 80382784
iPXE> INTEL 0x79a067b8 link status is 80382784
lotest
Usage:

  lotest [-m|--mtu <mtu>] [-b|--broadcast] <sending interface> <receiving interface>

See https://ipxe.org/cmd/lotest for further information
iPXE> lotest net0 net1
Waiting for link-up on net0......................................................... Operation canceled (https://ipxe.org/0b072095)
Test failed: Operation canceled (https://ipxe.org/0b072095)
iPXE> lotest net2 net3
NETDEV net2 opening
INTEL 0x79a08598 ring 03800 is at [7b744000,7b744100)
INTEL 0x79a08598 ring 02800 is at [7b743000,7b743100)
INTEL 0x79a08598 link status is 80382788
NETDEV net3 opening
INTEL 0x79a0a418 ring 03800 is at [7b742000,7b742100)
INTEL 0x79a0a418 ring 02800 is at [7b741000,7b741100)
INTEL 0x79a0a418 link status is 8038278c
INTEL 0x79a08598 link status is 80382788
Waiting for link-up on net2...INTEL 0x79a0a418 link status is 8038278c
........................... Operation canceled (https://ipxe.org/0b072095)
Test failed: Operation canceled (https://ipxe.org/0b072095)
            
iPXE> lotest net0 net1
Waiting for link-up on net0.................... Operation canceled (https://ipxe.org/0b072095)
Test failed: Operation canceled (https://ipxe.org/0b072095)
iPXE> lotest net1 net2
Waiting for link-up on net1........................ Operation canceled (https://ipxe.org/0b072095)
Test failed: Operation canceled (https://ipxe.org/0b072095)
iPXE> lotest net2 net3
Waiting for link-up on net2............................. Operation canceled (https://ipxe.org/0b072095)
Test failed: Operation canceled (https://ipxe.org/0b072095)

Additionally, I experimented with src/drivers/net/intel.c and drivers/net/intelx.c,
where in intel_poll() I added a recheck via intel_check_link ( netdev );
[image]
and I even modified intel_check_link() to always call netdev_link_up ( netdev );
like this:
[image]
but even when I open the device and try to get an IP address, the link is dead:
[image]

I think something is wrong at the lower PHY layer: either the NICs are not initialized properly or something is missing in the driver.
The intel driver also uses timers to probe the link, so it may be a timing issue in the sequence of checking the PHY state.

I added more debug flags to get more verbose output:

make bin-x86_64-efi/ipxe.efi DEBUG=intel,intelx,netdevice,serial,console,efi_utils,snp,nii,snponly,snpnet,efi_driver,efi_init,efi_pci,pci,ice EMBED=SynoBootLoader.ipxe -j 32

Budrate: 115200, lcr: 3, dlm: 0, dll: 1 
iPXE initialising devices...EFIDRV connecting our drivers
EFIPCI 0000:00:00.0 (8086:1f0c class 060000) has no driver
EFIPCI 0000:00:01.0 type 01 is not type 00
EFIPCI 0000:00:02.0 type 01 is not type 00
EFIPCI 0000:02:00.0 (1095:3132 class 010600) has no driver
EFIPCI 0000:00:03.0 type 01 is not type 00
EFIPCI 0000:00:04.0 type 01 is not type 00
EFIPCI 0000:04:00.0 (1b6f:7052 class 0c0330) has no driver
EFIPCI 0000:00:0e.0 (8086:1f14 class 060000) has no driver
EFIPCI 0000:00:0f.0 (8086:1f16 class 080600) has no driver
EFIPCI 0000:00:13.0 (8086:1f15 class 088000) has no driver
EFIPCI 0000:00:14.0 (8086:1f41 class 020000) has driver "i354"
EFIDRV PciRoot(0x0)/Pci(0x14,0x0) has driver "PCI"
EFIDRV PciRoot(0x0)/Pci(0x14,0x0) disconnecting existing drivers
EFIDRV PciRoot(0x0)/Pci(0x14,0x0) connecting new drivers
EFIPCI 0000:00:14.0 (8086:1f41 class 020000) has driver "i354"
EFIDRV PciRoot(0x0)/Pci(0x14,0x0) has driver "PCI"
EFIDRV PciRoot(0x0)/Pci(0x14,0x0) DRIVER_START
EFIPCI 0000:00:14.0 (8086:1f41 class 020000) has driver "i354"
0000:00:14.0 (8086:1f41) has driver "i354"
0000:00:14.0 has mem 80280000 io 20c0 irq 255
0000:00:14.0 latency timer is unreasonably low at 0. Setting to 32.
INTEL 0x79a05a58 MAC+PHY reset (08100241/80382780 was 08100241/80382780)
INTEL 0x79a05a58 has autoloaded MAC address 00:a0:c9:00:00:00
NETDEV net0 registered (phys 0000:00:14.0 hwaddr 00:a0:c9:00:00:00)
aaaINTEL 0x79a05a58 link status is 80382780
NETDEV net0 link is down: Down (https://ipxe.org/38086193)
EFIPCI 0000:00:14.0 using driver "i354"
EFIDRV PciRoot(0x0)/Pci(0x14,0x0) using driver "PCI"
EFIPCI 0000:00:14.1 (8086:1f41 class 020000) has driver "i354"
EFIDRV PciRoot(0x0)/Pci(0x14,0x1) has driver "PCI"
EFIDRV PciRoot(0x0)/Pci(0x14,0x1) disconnecting existing drivers
EFIDRV PciRoot(0x0)/Pci(0x14,0x1) connecting new drivers
EFIPCI 0000:00:14.1 (8086:1f41 class 020000) has driver "i354"
EFIDRV PciRoot(0x0)/Pci(0x14,0x1) has driver "PCI"
EFIDRV PciRoot(0x0)/Pci(0x14,0x1) DRIVER_START
EFIPCI 0000:00:14.1 (8086:1f41 class 020000) has driver "i354"
0000:00:14.1 (8086:1f41) has driver "i354"
0000:00:14.1 has mem 802a0000 io 20a0 irq 255
0000:00:14.1 latency timer is unreasonably low at 0. Setting to 32.
INTEL 0x79a067b8 MAC+PHY reset (08100241/80382784 was 08100241/80382784)
INTEL 0x79a067b8 has autoloaded MAC address 00:a0:c9:00:00:01
NETDEV net1 registered (phys 0000:00:14.1 hwaddr 00:a0:c9:00:00:01)
aaaINTEL 0x79a067b8 link status is 80382784
NETDEV net1 link is down: Down (https://ipxe.org/38086193)
EFIPCI 0000:00:14.1 using driver "i354"
EFIDRV PciRoot(0x0)/Pci(0x14,0x1) using driver "PCI"
EFIPCI 0000:00:14.2 (8086:1f41 class 020000) has driver "i354"
EFIDRV PciRoot(0x0)/Pci(0x14,0x2) has driver "PCI"
EFIDRV PciRoot(0x0)/Pci(0x14,0x2) disconnecting existing drivers
EFIDRV PciRoot(0x0)/Pci(0x14,0x2) connecting new drivers
EFIPCI 0000:00:14.2 (8086:1f41 class 020000) has driver "i354"
EFIDRV PciRoot(0x0)/Pci(0x14,0x2) has driver "PCI"
EFIDRV PciRoot(0x0)/Pci(0x14,0x2) DRIVER_START
EFIPCI 0000:00:14.2 (8086:1f41 class 020000) has driver "i354"
0000:00:14.2 (8086:1f41) has driver "i354"
0000:00:14.2 has mem 802c0000 io 2080 irq 255
0000:00:14.2 latency timer is unreasonably low at 0. Setting to 32.
INTEL 0x79a08598 MAC+PHY reset (08100241/80382788 was 08100241/80382788)
INTEL 0x79a08598 has autoloaded MAC address 00:a0:c9:00:00:02
NETDEV net2 registered (phys 0000:00:14.2 hwaddr 00:a0:c9:00:00:02)
aaaINTEL 0x79a08598 link status is 80382788
NETDEV net2 link is down: Down (https://ipxe.org/38086193)
EFIPCI 0000:00:14.2 using driver "i354"
EFIDRV PciRoot(0x0)/Pci(0x14,0x2) using driver "PCI"
EFIPCI 0000:00:14.3 (8086:1f41 class 020000) has driver "i354"
EFIDRV PciRoot(0x0)/Pci(0x14,0x3) has driver "PCI"
EFIDRV PciRoot(0x0)/Pci(0x14,0x3) disconnecting existing drivers
EFIDRV PciRoot(0x0)/Pci(0x14,0x3) connecting new drivers
EFIPCI 0000:00:14.3 (8086:1f41 class 020000) has driver "i354"
EFIDRV PciRoot(0x0)/Pci(0x14,0x3) has driver "PCI"
EFIDRV PciRoot(0x0)/Pci(0x14,0x3) DRIVER_START
EFIPCI 0000:00:14.3 (8086:1f41 class 020000) has driver "i354"
0000:00:14.3 (8086:1f41) has driver "i354"
0000:00:14.3 has mem 802e0000 io 2060 irq 255
0000:00:14.3 latency timer is unreasonably low at 0. Setting to 32.
INTEL 0x79a0a418 MAC+PHY reset (08100241/8038278c was 08100241/8038278c)
INTEL 0x79a0a418 has autoloaded MAC address 00:a0:c9:00:00:03
NETDEV net3 registered (phys 0000:00:14.3 hwaddr 00:a0:c9:00:00:03)
aaaINTEL 0x79a0a418 link status is 8038278c
NETDEV net3 link is down: Down (https://ipxe.org/38086193)
EFIPCI 0000:00:14.3 using driver "i354"
EFIDRV PciRoot(0x0)/Pci(0x14,0x3) using driver "PCI"
EFIPCI 0000:00:16.0 (8086:1f2c class 0c0320) has no driver
EFIPCI 0000:00:17.0 (8086:1f22 class 010601) has no driver
EFIPCI 0000:00:18.0 (8086:1f32 class 010601) has no driver
EFIPCI 0000:00:1f.0 (8086:1f38 class 060100) has no driver
EFIPCI 0000:00:1f.3 (8086:1f3c class 0c0500) has no driver
ok
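
A verbose log like the one above is easier to read once it is reduced to one entry per PCI device. Below is a minimal Python sketch that extracts the `EFIPCI ... has driver`/`has no driver` lines. The regular expression is inferred only from the line format shown in this log, and `drivers_by_device` is a hypothetical helper name.

```python
import re

# Pattern inferred from the log lines above (an assumption, not an iPXE spec).
EFIPCI_RE = re.compile(
    r'^EFIPCI (?P<addr>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]) '
    r'\((?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4}) '
    r'class (?P<cls>[0-9a-f]{6})\) '
    r'has (?:driver "(?P<driver>[^"]+)"|no driver)$'
)


def drivers_by_device(log_text):
    """Map each PCI address to its driver name, or None if it has no driver."""
    result = {}
    for line in log_text.splitlines():
        match = EFIPCI_RE.match(line.strip())
        if match:
            result[match["addr"]] = match["driver"]
    return result


# A few lines copied from the log above.
SAMPLE = """\
EFIPCI 0000:00:00.0 (8086:1f0c class 060000) has no driver
EFIPCI 0000:00:01.0 type 01 is not type 00
EFIPCI 0000:00:14.0 (8086:1f41 class 020000) has driver "i354"
EFIPCI 0000:00:14.1 (8086:1f41 class 020000) has driver "i354"
"""

for addr, driver in drivers_by_device(SAMPLE).items():
    print(addr, driver or "no driver")
```

Applied to the full log, this would list the four 8086:1f41 functions (class 020000, Ethernet controller) as the only devices bound to a driver, which is consistent with detection working and the problem lying in link bring-up.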

I found reports online of similar issues with this card when auto-negotiation is off:
What-to-do-for-i354-link-to-come-up-It-is-not-coming-up-whereas

i354-autoneg-off-link-issue

Let me know if I can do anything to solve it.

@stappersg
Contributor

Let me know if I can do anything to solve it.

  • Write more documentation about the project
  • Make a merge request that fixes #1189
  • Keep that MR in good condition
  • Enable others to join the quest

@hetii
Author

hetii commented Apr 12, 2024

@stappersg

The goal of this project is to be able to run iPXE on the NAS mentioned above and boot experimental systems that I want to prepare for it.

The solution I gave for #1189 is wrong, so it makes no sense to prepare a merge request for broken code.

I mean, it works as a workaround to get input working, but it should not be considered a solution: it blocks the main thread, and the root cause of why the UART_LSR_DR bit is not set most of the time is still unknown to me.

The current NIC task is another story: as you can see from the logs, the card is detected but somehow the link does not come up.

I have already tried different settings for automatic speed detection, but I have very limited knowledge of writing such complex drivers.

One idea that came to mind is to strip the Linux igb driver (since it works) down to the very basic code with which my NICs report link up, and try to port that part to iPXE, but I am not sure if or when I will be able to do it.

If someone on board already knows the Intel drivers and knows what can be checked regarding registers and initialization, then I will be more than happy to spend some hours checking things...

@mcb30
Member

mcb30 commented Apr 15, 2024

@hetii I was going to suggest trying the INTEL_NO_PHY_RST flag, but your original post shows the debug message

INTEL 0x79a05a58 MAC+PHY reset (08100241/80382780 was 08100241/80382780)

which indicates that the link was down when iPXE started execution (i.e. it's not the reset that's causing the link to drop).
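
The reasoning above can be checked mechanically. This is a minimal Python sketch that parses the reset message and compares the link-up bit before and after, assuming the four hex fields are CTRL/STATUS after the reset followed by CTRL/STATUS before it, and that bit 1 of STATUS is the link-up flag. `reset_effect` is a hypothetical helper, not an iPXE function.

```python
import re

# Field order assumed: "reset (ctrl/status was orig_ctrl/orig_status)".
RESET_RE = re.compile(
    r"MAC\+PHY reset \((?P<ctrl>[0-9a-f]{8})/(?P<status>[0-9a-f]{8}) "
    r"was (?P<orig_ctrl>[0-9a-f]{8})/(?P<orig_status>[0-9a-f]{8})\)"
)
STATUS_LU = 0x00000002  # assumed link-up bit in the STATUS register


def reset_effect(line):
    """Report whether the reset changed the registers or the link state."""
    match = RESET_RE.search(line)
    if match is None:
        raise ValueError("not a MAC+PHY reset debug line")
    regs = {name: int(text, 16) for name, text in match.groupdict().items()}
    return {
        "link_up_before": bool(regs["orig_status"] & STATUS_LU),
        "link_up_after": bool(regs["status"] & STATUS_LU),
        "registers_changed": (regs["ctrl"], regs["status"])
        != (regs["orig_ctrl"], regs["orig_status"]),
    }


line = "INTEL 0x79a05a58 MAC+PHY reset (08100241/80382780 was 08100241/80382780)"
print(reset_effect(line))
```

With the line from the original post, the link-up bit is clear both before and after the reset and neither register changed, which fits the conclusion that the reset is not what drops the link.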

I'd suggest checking the Linux driver to see what it does differently between i350 and i354.

Are the NIC ports connected to completely standard RJ-45 Gigabit Ethernet switch ports with autonegotiation enabled, or is there some kind of unusual network setup?

@hetii
Author

hetii commented Apr 15, 2024

@mcb30 Hi,
I have a standard Zbtlink ZBT-WE1326 router connected to nic0, and in addition nic2 and nic3 are connected to each other. All of these interfaces come up under Linux without issue as soon as I call ip link set up on them.
