Re: System freeze on reboot - general protection fault

From: Zdenek Kabelac
Date: Tue Aug 11 2009 - 17:10:44 EST


2009/8/11 Robin Holt <holt@xxxxxxx>:
> On Tue, Aug 11, 2009 at 05:32:16PM +0200, Zdenek Kabelac wrote:
>> 2009/8/11 Christoph Lameter <cl@xxxxxxxxxxxxxxxxxxxx>:
>> > On Tue, 11 Aug 2009, Zdenek Kabelac wrote:
>> >
>> >> Well - I've tried to switch from  'slub' allocator to  old 'slab'
>> >> allocator and the problem is gone - shutdown goes without any problem.
>> >>
>> >> So it's probably related to 'slub' allocator only?
>> >
>> > The slab allocator does not have all the diagnostics of slub. The issue
>> > may simply not be detected in slab. If you switch off diagnostics in slub
>> > then everything will seem to work fine as well. But we need to figure out
>> > what is going wrong here.
>>
>> Hmm - but there are a few things -
>>
>> My machine runs  Fedora Rawhide. If I run the same kernel within KVM
>> running Debian unstable I could easily reboot this guest machine
>> without any problems.
>>
>> Also if I boot Rawhide only to single mode - I could also reboot
>> machine without this oops.
>> The problem seems to be - when I do full machine startup to the
>> multiuser runlevel 3
>
> Try booting all the way and recording your output from lsmod.  Reboot
> single user mode and modprobe each of the modules in that original list.
> Test shutdown from single user mode.  This might identify if it is one
> of your loaded modules.  If so, effectively bisect the modprobes until
> you find the offending module(s).
>
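
For reference, the module bisection described above could be sketched
roughly as follows. This is only an illustration under stated assumptions:
bisect_modules and the fails callback are hypothetical names, exactly one
module is assumed to be the culprit, and in practice fails would be a manual
step (modprobe the given modules in single user mode, shut down, and note
whether the oops appears). Module dependencies pulled in by modprobe are not
modelled here.

```python
from typing import Callable, List, Sequence


def bisect_modules(modules: Sequence[str],
                   fails: Callable[[List[str]], bool]) -> str:
    """Return the one module whose presence makes fails() return True.

    Assumption: exactly one module triggers the failure, and the failure
    reproduces whenever that module is among the loaded ones.
    """
    candidates = list(modules)
    if not fails(candidates):
        raise ValueError("full module list does not reproduce the failure")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # If the first half alone reproduces the failure, the culprit is in
        # it; otherwise (single-culprit assumption) it is in the second half.
        candidates = half if fails(half) else candidates[len(half):]
    return candidates[0]
```

With N modules this needs about log2(N) shutdown tests instead of N. If
several modules only fail in combination, the single-culprit assumption
breaks and the halves have to be tested more carefully.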


OK - it turned out to be more complex than that once I tried to reproduce
this oops on my laptop without a wired network connection. I won't bother
you with the details here; the result is this:

If I remove nf_conntrack_ipv4.ko so that it cannot be loaded, the
problem is gone.
So it looks like the memory problem is related to netfilter. Multiple
modules are loaded as dependencies because of this one, so it's
hard to say exactly which of them causes the trouble.
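
To narrow down which modules come in as direct dependencies, something like
the following could be run against the contents of /proc/modules. It is a
rough sketch: modules_required_by is a hypothetical helper name, and it relies
on the fourth field of each /proc/modules line listing the modules that use
that entry ("-" when there are none). Only direct dependencies are reported,
not transitive ones.

```python
from typing import List


def modules_required_by(proc_modules_text: str, target: str) -> List[str]:
    """List modules whose 'used by' field in /proc/modules names target.

    Each line looks like: name size refcount users state address,
    where users is a comma-terminated list such as "a,b," or "-".
    """
    required = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name, used_by = fields[0], fields[3]
        users = [u for u in used_by.split(",") if u and u != "-"]
        if target in users:
            required.append(name)
    return required
```

For example, passing open("/proc/modules").read() and "nf_conntrack_ipv4"
would return the modules that nf_conntrack_ipv4 holds a reference on, which
are the first candidates to test individually.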

I've checked some recent commits in this area, and they seem to
be important ones
(e.g. 941297f443f871b8c3372feccf27a8733f6ce9e9 from 16 Jul).

I could probably try reverting some of them, but does anyone have an
idea what could be causing these problems?

I've added the authors of some recent conntrack commits to Cc; maybe
they know?

Zdenek