Re: lost interrupts when running sabrelite images (v4.15+) in qemu

From: Guenter Roeck
Date: Tue Mar 06 2018 - 09:25:25 EST


On 03/05/2018 09:30 AM, Troy Kisky wrote:
On 3/3/2018 1:12 PM, Guenter Roeck wrote:
On 03/03/2018 12:48 PM, Guenter Roeck wrote:
On 03/03/2018 11:07 AM, Troy Kisky wrote:
On 3/3/2018 8:32 AM, Guenter Roeck wrote:
Hi,

since v4.15, I get the following runtime warning when running sabrelite images
in qemu.

irq 65: nobody cared (try booting with the "irqpoll" option)
...
handlers:
[<26292474>] fec_pps_interrupt
Disabling IRQ #65
fec 2188000.ethernet (unnamed net_device) (uninitialized): MDIO read timeout

Bisect points to commit 4ad1ceec05e491 ("net: fec: Let fec_ptp have its
own interrupt routine"). Analysis shows that platform_irq_count()
returns 2, which is reduced to 1 by fec_enet_get_irq_cnt().
If I let fec_enet_get_irq_cnt() return 2, the problem is gone.
Reverting commit 4ad1ceec05e491 also fixes the problem.
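
For anyone following along, the count reduction described above can be modeled in a few lines of standalone C. This is only a sketch of the behaviour reported here, not the driver source: model_fec_irq_cnt and MODEL_FEC_IRQ_NUM are hypothetical names, and the cap of 3 frame interrupts is an assumption.

```c
/* Assumption: the driver services at most three frame (tx/rx) interrupts. */
#define MODEL_FEC_IRQ_NUM 3

/*
 * Hypothetical model of the count reduction: given the number of
 * interrupts the platform describes for the device, return how many
 * of them the frame (tx/rx) handler claims. Any remaining line is
 * left for the PPS/PTP handler.
 */
static int model_fec_irq_cnt(int platform_cnt)
{
	if (platform_cnt > MODEL_FEC_IRQ_NUM)
		return MODEL_FEC_IRQ_NUM;	/* last one is left for pps */
	if (platform_cnt == 2)
		return 1;			/* last one is left for pps */
	if (platform_cnt <= 0)
		return 1;			/* at least one irq is needed */
	return platform_cnt;
}
```

With a platform count of 2, only interrupt index 0 goes to the frame handler and index 1 is left to fec_pps_interrupt. Frame interrupts arriving on index 1 would then go unhandled, which would match the "nobody cared" warning.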

Bisect log is attached.


Sounds like you found a bug with qemu. I just booted sabrelite over nfs fine.
My interrupts look like this.


 64: 98767 0 0 0 GIC-0 150 Level 2188000.ethernet
 65: 0 0 0 0 GIC-0 151 Level 2188000.ethernet
___________
Irq 65 is only for ptp interrupts now. If qemu is signaling a tx/rx frame interrupt on 65,
then qemu is wrong. Of course, I've never used qemu so feel free to ignore me if I make no sense.


Thanks for checking with real hardware.

This is what I see (with your patch reverted):

 64: 0 GIC-0 150 Level 2188000.ethernet
 65: 64 GIC-0 151 Level 2188000.ethernet

Looking into the qemu source, I see:

#define FSL_IMX6_ENET_MAC_1588_IRQ 118
#define FSL_IMX6_ENET_MAC_IRQ 119

FSL_IMX6_ENET_MAC_IRQ is then connected to fec interrupt index 0, and FSL_IMX6_ENET_MAC_1588_IRQ
is connected to fec interrupt index 1.

This may suggest that the defines are reversed. I'll see what happens if I swap them.
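
A quick arithmetic cross-check of that suspicion, as a sketch: it assumes the number in the GIC-0 column is the GIC interrupt ID and that shared peripheral interrupts start at ID 32. The QEMU_* names and spi_to_gic_id are hypothetical stand-ins for the qemu defines quoted above.

```c
/* Assumption: GIC shared peripheral interrupts (SPIs) start at ID 32. */
#define GIC_SPI_BASE 32

/* Values as quoted from the qemu source above (names shortened here). */
#define QEMU_ENET_MAC_1588_IRQ 118
#define QEMU_ENET_MAC_IRQ      119

/* Convert an SPI number to the interrupt ID shown in the GIC-0 column. */
static int spi_to_gic_id(int spi)
{
	return spi + GIC_SPI_BASE;
}
```

Under these assumptions qemu puts the MAC interrupt on GIC ID 151 and the 1588 interrupt on 150, while the real-hardware listing shows the frame traffic on 150. That is consistent with the two values being reversed.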


Confirmed. If I swap the above defines, everything works fine. At the same time,
the modified qemu works with older kernels.

Thanks a lot for the hint, and sorry for the noise.

Guenter

It definitely was not noise. I bet it helps people searching the mailing list in the future.
Thanks for posting the resolution.


Turns out "works" as I stated above is not entirely accurate.

- v4.13 and later work
- In v4.12 and earlier, the Ethernet interface fails to instantiate with
    fec 2188000.ethernet (unnamed net_device) (uninitialized): MDIO read timeout
    fec: probe of 2188000.ethernet failed with error -5
  I have not found the reason yet. Unmodified qemu works fine.
- v4.1 and earlier crash. The crash is fixed by commit 32cba57ba74be ("net: fec:
  introduce fec_ptp_stop and use in probe fail path")

There is also a matching bug at Launchpad:

https://bugs.launchpad.net/qemu/+bug/1753309

Guenter