Re: Thread+network crashes in 2.0/2.1

Linus Torvalds (torvalds@transmeta.com)
Sat, 28 Feb 1998 12:39:25 -0800 (PST)


On Sat, 28 Feb 1998, Robert M. Fleischman wrote:
>
> By using our System.map and ksymoops.cc, that last crash looks like:

Ok, having decoded this a bit more, it looks more and more like a network
driver bug. I'm adding Donald to the Cc list..

> Code: c01b6b09 <boomerang_rx+209/3a0>
> Code: c01b6b09 <boomerang_rx+209/3a0> 8b ab 8c 00 00 movl 0x8c(%ebx),%ebp
> Code: c01b6b0f <boomerang_rx+20f/3a0> 8d 3c 2a leal (%edx,%ebp,1),%edi
> Code: c01b6b12 <boomerang_rx+212/3a0> 89 bb 8c 00 00 movl %edi,0x8c(%ebx)
> Code: c01b6b1e <boomerang_rx+21e/3a0> 01 53 5c addl %edx,0x5c(%ebx)
> Code: c01b6b21 <boomerang_rx+221/3a0> 8b 83 00 90 90 movl 0x90909000(%ebx),%eax

The above is part of the following code:

	/* Pass up the skbuff already on the Rx ring. */
	skb = vp->rx_skbuff[entry];
	vp->rx_skbuff[entry] = NULL;
	temp = skb_put(skb, pkt_len);

and in particular it seems that "skb" itself is NULL: the driver does no
sanity checking at all on whether the rx_skbuff[] array got depleted, so
when you see heavy network traffic it will go through the whole array and
start using NULL pointers.
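
To illustrate the kind of sanity check that appears to be missing, here is a minimal user-space sketch. It is not the driver's code: `fake_skb`, `fake_ring`, `rx_take`, `rx_dropped` and the ring size of 4 are all hypothetical names and values made up for the example. It only shows the idea of testing the ring slot for NULL before using it.

```c
#include <assert.h>
#include <stddef.h>

#define RX_RING_SIZE 4   /* hypothetical ring size for this sketch */

/* Hypothetical stand-in for struct sk_buff; it only tracks a length. */
struct fake_skb {
    unsigned int len;
};

/* Hypothetical stand-in for the driver's private state. */
struct fake_ring {
    struct fake_skb *rx_skbuff[RX_RING_SIZE];
    unsigned int cur_rx;      /* next entry to receive from */
    unsigned int rx_dropped;  /* packets dropped because the slot was empty */
};

/*
 * Take the buffer for the current entry off the ring.
 * Returns NULL (and counts a drop) when the slot is already empty,
 * instead of passing a NULL pointer on to skb_put()-style code.
 */
static struct fake_skb *rx_take(struct fake_ring *vp, unsigned int pkt_len)
{
    unsigned int entry = vp->cur_rx % RX_RING_SIZE;
    struct fake_skb *skb = vp->rx_skbuff[entry];

    assert(entry < RX_RING_SIZE);
    if (skb == NULL) {
        /* Ring depleted at this entry: drop rather than dereference. */
        vp->rx_dropped++;
        return NULL;
    }
    vp->rx_skbuff[entry] = NULL;
    skb->len += pkt_len;      /* stands in for skb_put(skb, pkt_len) */
    vp->cur_rx++;
    return skb;
}
```

With one buffer on the ring, the first call to `rx_take()` returns it and the second returns NULL and bumps `rx_dropped`, rather than crashing.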

That's what it very much looks like, at least: it tries to avoid doing
this by having this "rx_work_limit" thing, but it is obviously not working
(at the very least there seems to be an off-by-one error, at worst the
whole idea is broken because once all the skb's are dirty the read routine
will never be able to do any work at all).
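
The worst case described above (every buffer used up, so the read routine can do nothing) can be sketched in user space as follows. This is only an illustration under stated assumptions, not how the actual driver is structured: `ring`, `ring_init`, `ring_refill`, `ring_consume`, `ring_poll`, the ring size and the buffer size are all hypothetical. The sketch bounds receive work by the number of buffers actually present and refills on every poll, so a fully drained ring can recover.

```c
#include <stdlib.h>

#define RING 4            /* hypothetical ring size */
#define BUF_SIZE 64       /* hypothetical buffer size */

struct ring {
    void *buf[RING];
    unsigned int cur;     /* free-running count of slots consumed */
    unsigned int dirty;   /* free-running count of slots refilled */
};                        /* cur - dirty == number of empty slots */

/* Refill every empty slot; stops early if allocation fails. */
static unsigned int ring_refill(struct ring *r)
{
    unsigned int n = 0;
    while (r->cur - r->dirty > 0) {
        void *p = malloc(BUF_SIZE);
        if (p == NULL)
            break;
        r->buf[r->dirty % RING] = p;
        r->dirty++;
        n++;
    }
    return n;
}

/* Start with every slot marked empty, then fill the whole ring. */
static void ring_init(struct ring *r)
{
    for (unsigned int i = 0; i < RING; i++)
        r->buf[i] = NULL;
    r->cur = RING;
    r->dirty = 0;
    ring_refill(r);
}

/* Consume at most `pending` packets, but never more than the buffers present. */
static unsigned int ring_consume(struct ring *r, unsigned int pending)
{
    unsigned int done = 0;
    while (done < pending && r->cur - r->dirty < RING) {
        unsigned int slot = r->cur % RING;
        free(r->buf[slot]);   /* stands in for passing the packet up */
        r->buf[slot] = NULL;
        r->cur++;
        done++;
    }
    return done;
}

/* Refill before and after receiving, so an empty ring does not stay stuck. */
static unsigned int ring_poll(struct ring *r, unsigned int pending)
{
    ring_refill(r);
    unsigned int done = ring_consume(r, pending);
    ring_refill(r);
    return done;
}
```

Calling `ring_consume()` alone drains the ring and then returns 0 forever; `ring_poll()` keeps making progress because refilling does not depend on receive work having been done first.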

Linus
