Re: 2.6.25-git2: BUG: unable to handle kernel paging request at ffffffffffffffff

From: Linus Torvalds
Date: Mon Apr 21 2008 - 13:49:44 EST




On Mon, 21 Apr 2008, Jiri Slaby wrote:
>
> BTW, I haven't seen this without a suspend/resume cycle, have you, Rafael? It
> doesn't mean anything, since it needs a longer time to trigger, but anyway, it
> might be a clue.

There's a separate (and very different-looking) bug-report about the atl1
driver having problems when doing an "ifconfig down" on it. In fact, the
problem report says:

> With this commit in tree, I can reproduce either
> a) kmalloc-2048 corruption after initscripts shutdown eth0
> http://marc.info/?l=linux-kernel&m=120820360221261&w=2
>
> b) or oopses at filp_close() first reported long ago
> (sorry, can't find that email)

where that "or oopses at filp_close()" thing is somewhat interesting,
since your original bug was about something that looked like file pointer
corruption.

Now, I doubt you have an ATL chip, and I doubt the two are _really_
related in any way (the ATL bug was actually triggered by enabling 64-bit
DMA), but the filp_close thing makes me go "hmm".

The two corrupted SLUB caches were the 2kB allocation (1500-byte
ethernet packets plus skb_shared_info overhead, anyone?) and apparently
the one that filp's are in (perhaps a 20-byte TCP ACK packet or other
"small" packet + the skb_shared_info overhead would be a common case that
might be in that 200-byte range?)

Maybe the ATL bug isn't ATL-specific at all, but somehow connected to
NETIF_F_HIGHDMA. Do you have 4GB+ of RAM?

And one thing that suspend/resume does, which is not necessarily commonly
done during normal operation, is that ifconfig down/up pattern. Maybe
there is something broken in general there?

Linus