Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.32-12

From: Ingo Molnar
Date: Fri Dec 10 2004 - 06:13:57 EST



* Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> Second, my ethernet doesn't work, and it really seems to be some kind
> of interrupt trouble. It sends out ARPs but doesn't see them come
> back, and it also doesn't seem to know that it sent them out. I get
> the following:
>
> NETDEV WATCHDOG: eth0: transmit timed out
> eth0: transmit timed out, tx_status 00 status e601.
> diagnostics: net 0ccc media 8880 dma 0000003a fifo 0000
> eth0: Interrupt posted but not delivered -- IRQ blocked by another
> device?
> Flags; bus-master 1, dirty 33(1) current 33(1)
> Transmit list 00000000 vs. f75012a0.
> 0: @f7501200 length 80000043 status 8c010043
> 1: @f75012a0 length 8000007a status 0c01007a
> 2: @f7501340 length 8000002a status 0001002a
> 3: @f75013e0 length 80000098 status 0c010098
> 4: @f7501480 length 8000002a status 0001002a
> 5: @f7501520 length 8000002a status 0001002a
> 6: @f75015c0 length 8000002a status 0001002a
> 7: @f7501660 length 8000002a status 0001002a
> 8: @f7501700 length 80000043 status 0c010043
> 9: @f75017a0 length 80000043 status 0c010043
> 10: @f7501840 length 8000004f status 0c01004f
> 11: @f75018e0 length 8000004f status 0c01004f
> 12: @f7501980 length 80000043 status 0c010043
> 13: @f7501a20 length 8000007a status 0c01007a
> 14: @f7501ac0 length 80000098 status 0c010098
> 15: @f7501b60 length 8000002a status 8001002a
>
> I have a (from lspci)
> 0000:02:08.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M
> [Tornado] (rev 78)
>
>
> I can look into this further to see what the problem is. One funny
> note, on the vanilla kernel, my eth0 is at interrupt 177, but on the
> rt patched kernel its swapped with the sound card and is at interrupt
> 169. Well it's getting too late for me now (its 1am my time (01:00 for
> you European folks ;-) , and I need to get up at 6:30 am). Tomorrow,
> I'll hack on it some more.

yeah, please check this - you are the first one to report this issue.

A good first step would be to go switch to a non-PREEMPT_RT preemption
model (but to keep the -RT codebase) and see whether the breakage is
related to that. If the breakage goes away with say PREEMPT_DESKTOP then
i'd suggest to enable PREEMPT_HARDIRQS and PREEMPT_SOFTIRQS (but keep
PREEMPT_DESKTOP) - do these alone trigger the breakage? E.g. there was
an obscure timing bug in the floppy driver that only triggered with
PREEMPT_HARDIRQS enabled. So it's not out of question that there's some
other driver bug/race in hiding. The other possibility is some generic
-RT kernel breakage - like the SLAB issue was.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/