Re: 2.4.20-pre10aa1 oops report (was Re: Linux-2.4.20-pre8-aa2 oops report. [solved])

From: harisri (harisri@telstra.com)
Date: Mon Oct 14 2002 - 20:08:31 EST


Hello Andrea,
 
> this smells like a problem with one of your modules. Please make 100%
> sure you use exactly the same .config for both 2.4.20pre10 and
> 2.4.20pre10aa1 and please try to find which is the module that is
> crashing the kernel after it's being loaded. Expect always different
> kind of crashes and oopses. You can also try to turn on the slab
> debugging option in the kernel hacking menu.

Yes, I am using the same .config file from 2.4.20-pre10 on
2.4.20-pre10aa1 (of course I ran make oldconfig and accepted the default
settings that show up on 2.4.20-pre10aa1).
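
For anyone repeating this check, here is a minimal sketch of how the two
configurations could be compared option by option. The function name
config_diff and the file paths are hypothetical, not something used in
this thread; it only looks at lines that start with CONFIG_, so an option
that became "is not set" shows up as a line missing from one side.

```shell
# Hypothetical helper: print the CONFIG_ options that differ between
# two kernel .config files. Returns 0 when the option sets are identical.
config_diff() {
    a=$(mktemp)
    b=$(mktemp)
    # Keep only the enabled options and sort them so diff lines up.
    grep '^CONFIG_' "$1" | sort > "$a"
    grep '^CONFIG_' "$2" | sort > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return $status
}
```

Usage (paths are assumptions):
config_diff linux-2.4.20-pre10/.config linux-2.4.20-pre10aa1/.config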

I think you are right, it has something to do with the kernel modules.

> > Code; c01e55e2 <fast_clear_page+12/50>
>
> you also may want to configure the kernel as i686 instead of K7 so
> fast_clear_page won't be used to see if it makes any difference.

OK. That didn't really help. Even a kernel compiled for i386 crashes, but
the K7-optimised kernel crashes at Athlon speed :-)
 
> the place where the oops happens is most certainly not the problem,
> either something is wrong with fast_clear_page for whatever hardware
> reason, or more likely the moduled by modprobe is corrupting the
> freelist and alloc_pages returned garbage.
>
> btw, how much memory do you have? If you've more than 800M it could
> be a broken driver using pte_offset by hand, try to reproduce with
> mem=800m in such case. To fix this you should find which is the
> module that is destabilizing the kernel.

My computer has 512 MB RAM. No highmem.

I am able to trigger the issue (after 3 attempts [1]) with:
CONFIG_AGP m
CONFIG_AGP_AMD y
CONFIG_DRM y
CONFIG_DRM_RADEON m

Without them I couldn't trigger the issue (after 5 attempts [1]), so I
suspect they have something to do with it. Testing all of this takes a
lot of time, though; I think I will have good answers in a couple of
days.

[1]
1. Log in to XFree86/Gnome.
2. Start Mozilla, Evolution, OpenOffice Writer/Calc/Impress, Konqueror
and KMail, then exit them all.
3. mke2fs -j /dev/hdc9; mount /dev/hdc9 /test; cd /test;
dd if=/dev/zero of=zero bs=1024 count=2097152; cd /
4. Redo step 2.
5. Log out, log in and redo step 2.
6. Unmount /test.
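
As a side note on step 3, the small calculation below (a sketch, not part
of the test itself) works out how much data the dd writes from its bs and
count values and compares it with the 512 MB of RAM mentioned earlier in
this mail. The variable names are only illustrative.

```shell
# Size of the file written by the dd in step 3: bs * count bytes.
bs=1024
count=2097152
ram_mib=512    # RAM size stated in this mail

total_bytes=$((bs * count))
total_mib=$((total_bytes / 1024 / 1024))
ratio=$((total_mib / ram_mib))

echo "dd writes $total_bytes bytes ($total_mib MiB), ${ratio}x the ${ram_mib} MB of RAM"
```

So the file is 2 GiB, four times the installed memory, which means it
cannot fit in RAM while it is being written.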

Repeating the above test cycle a few times (on the 3rd attempt or so)
makes the system oops (when I had the AGP/AMD/DRM/Radeon options enabled).

Thanks for your help.

Hari
harisri@bigpond.com
 
