Re: Linux 2.6.22-rc2

From: Mike Houston
Date: Wed May 23 2007 - 13:39:34 EST


On Tue, 22 May 2007 17:00:18 -0700 (PDT)
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> and the load off "sk->sk_prot->ioctl" oopses, because "sk->sk_prot"
> is corrupt and contains 0x8e3cad42, which is not a valid kernel
> pointer.
>
> The other oops is even worse.
>
> I also think it meshes with
>
> sky2 eth0: descriptor error q=0x280 get=285
> [800042375e2e5e] put=285
>
> and I suspect your memory got corrupted by sky2 reading the wrong
> descriptors, and overwriting kernel memory.
>
> So it's almost certainly some DMA problem. Now, _why_ you have DMA
> problems, I have no idea. But can you try:
> - disable CONFIG_PREEMPT
> - disable CONFIG_HIGHMEM if you have it on
> - just in general see if you can disable any kernel config options
> that might be unnecessary.
> to see if it changes the situation at all..

Thanks for looking at this. After further posts in the discussion I
wasn't sure if you still wanted me to try this, but I thought it
might be useful to see if (particularly) highmem support might change
the behaviour, or the messages in any way that might lead to a clue.
There was no change to the behaviour.

I have a Core 2 duo, and 2 Gb of RAM, but I built a uniprocessor
kernel (with apic), without highmem support, with no PREEMPT and
without other unnecessary stuff. If by chance I got it working, my
plan was to enable things one at a time.

I won't get that oops on this setup though (never have, anyways...
it was just the PCLinux install on that other hard disk which has
now been returned to use elsewhere), but the messages on trying to
transfer data are the same:

First try (instant failure on trying to ssh):

May 23 12:51:14 cramit kernel: sky2 eth0: enabling interface
May 23 12:51:14 cramit kernel: sky2 eth0: ram buffer 0K
May 23 12:51:16 cramit kernel: sky2 eth0: Link is up at 100 Mbps,
full duplex, flow control both May 23 12:51:34 cramit kernel: sky2
0000:04:00.0: error interrupt status=0x1 May 23 12:51:34 cramit
kernel: sky2 eth0: descriptor error q=0x280 get=7 [0] put=7

Second try after cold boot (failure on trying to transfer file):

May 23 12:52:59 cramit kernel: sky2 eth0: enabling interface
May 23 12:52:59 cramit kernel: sky2 eth0: ram buffer 0K
May 23 12:53:01 cramit kernel: sky2 eth0: Link is up at 100 Mbps,
full duplex, flow control both
May 23 12:55:40 cramit kernel: sky2
0000:04:00.0: error interrupt status=0x80000000
May 23 12:55:40 cramit kernel: sky2 eth0: hw error interrupt status
0x8
May 23 12:55:40 cramit kernel: sky2 eth0: MAC parity error

This is exactly the behaviour I've been seeing.

I still happen to have a Windows Vista install kicking around, so to
make sure we're not flogging a dead horse I booted that and let it
set up the yukon2 chip and I tested it. (more to make sure that
eeprom update didn't break it). I used it for a bit and successfully
transferred some large files from box running Samba. MS must be using
some specific workaround or something.

Mike Houston
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/