Re: System freeze on reboot - general protection fault

From: Paul E. McKenney
Date: Thu Sep 03 2009 - 14:17:23 EST


On Thu, Sep 03, 2009 at 12:17:43AM +0200, Eric Dumazet wrote:
> Zdenek Kabelac wrote:
> > 2009/8/17 Patrick McHardy <kaber@xxxxxxxxx>:
> >> Eric Dumazet wrote:
> >>> Zdenek Kabelac wrote:
> >>>> [<ffffffffa02c502f>] nf_conntrack_ftp_fini+0x2f/0x70 [nf_conntrack_ftp]
> >>>> [<ffffffff8027bcc5>] sys_delete_module+0x1a5/0x270
> >>>> [<ffffffff8020d329>] ? retint_swapgs+0xe/0x13
> >>>> [<ffffffff80271bf2>] ? trace_hardirqs_on_caller+0x162/0x1b0
> >>>> [<ffffffff80292121>] ? audit_syscall_entry+0x191/0x1c0
> >>>> [<ffffffff80526dae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> >>>> [<ffffffff8020c84b>] system_call_fastpath+0x16/0x1b
> >>>> Code: c6 00 00 0f 82 66 ff ff ff 49 8b 9e d8 05 00 00 48 85 db 75 16
> >>>> e9 8e 00 00 00 0f 1f 44 00 00 48 85 c0 0f 84 80 00 00 00 48 89 c3 <0f>
> >>>> b6 4b 37 48 8b 03 48 8d 14 cd 00 00 00 00 0f 18 08 48 29 ca
> >>>> RIP [<ffffffffa02b2c2c>] nf_conntrack_helper_unregister+0x16c/0x320
> >>>> [nf_conntrack]
> >>>> RSP <ffff88013982fe68>
> >>>> CR2: 0000000000000038
> >>>> ---[ end trace bc3a0ede3d0084db ]---
> >>>>
> >>> I am currently traveling and won't be able to help you before next week.
> >>>
> >>> I added netdev, Patrick, and netfilter-devel in CC so that more eyes can take a look.
> >> Thanks for the report, I'll have a look at this. Zdenek, please
> >> send me the nf_conntrack.ko file used in the above oops. Thanks.
> >>
> >
> > Ok
> >
> > I've found the solution for my problem.
> >
> > http://thread.gmane.org/gmane.comp.security.firewalls.netfilter.devel/30483
> >
> > I've made this small fix from this thread:
> >
> > diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core
> > index b5869b9..68488f8 100644
> > --- a/net/netfilter/nf_conntrack_core.c
> > +++ b/net/netfilter/nf_conntrack_core.c
> > @@ -1108,6 +1108,7 @@ static void nf_conntrack_cleanup_init_net(void)
> > {
> > nf_conntrack_helper_fini();
> > nf_conntrack_proto_fini();
> > + rcu_barrier();
> > kmem_cache_destroy(nf_conntrack_cachep);
> > }
> >
> > @@ -1266,7 +1267,7 @@ static int nf_conntrack_init_init_net(void)
> >
> > nf_conntrack_cachep = kmem_cache_create("nf_conntrack",
> > sizeof(struct nf_conn),
> > - 0, SLAB_DESTROY_BY_RCU, NULL);
> > + 0, 0, NULL);
> > if (!nf_conntrack_cachep) {
> > printk(KERN_ERR "Unable to create nf_conn slab cache\n");
> > ret = -ENOMEM;
> >
> >
> > As the thread nf_conntrack: Use rcu_barrier() and fix kmem_cache_create flags
> > seems to be somewhat 'unfinished' and already a bit old and I've no
> > idea whether it actually fixes the problem completely or just hides it in
> > my case - I'm leaving it to some RCU gurus to fix this issue.
> >
> > All I can say is that this extra rcu_barrier() and the removal of
> > SLAB_DESTROY_BY_RCU remove my GPF on reboot.
> >
> > Zdenek
>
> Ouch..
>
> Don't think such a patch makes your kernel better, it'll crash too.
>
> You cannot remove SLAB_DESTROY_BY_RCU like this, it's there for very good reasons.

And if I understand correctly, this is more evidence that
kmem_cache_destroy() needs to do an rcu_barrier() in the
SLAB_DESTROY_BY_RCU case.
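
To make the ordering concrete, here is a minimal userspace sketch of the
idea, not kernel code. Every name ending in _model is hypothetical and only
imitates the shape of call_rcu(), rcu_barrier() and kmem_cache_destroy();
the real slab allocators are far more involved. The assumption being
modelled is that a SLAB_DESTROY_BY_RCU cache hands its slabs back through
deferred callbacks, so destroying the cache is only safe once those
callbacks have run.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; these are NOT the kernel's types or APIs. */
struct rcu_head_model {
    struct rcu_head_model *next;
    void (*func)(struct rcu_head_model *head);
};

/* Callbacks that have been queued but have not run yet. */
static struct rcu_head_model *pending;

/* Models call_rcu(): queue a callback instead of running it now. */
static void call_rcu_model(struct rcu_head_model *head,
                           void (*func)(struct rcu_head_model *head))
{
    head->func = func;
    head->next = pending;
    pending = head;
}

/* Models rcu_barrier(): return only after every queued callback has run. */
static void rcu_barrier_model(void)
{
    while (pending) {
        struct rcu_head_model *head = pending;

        pending = head->next;
        head->func(head);
    }
}

struct cache_model {
    int destroy_by_rcu;     /* stands in for SLAB_DESTROY_BY_RCU */
    int slabs_outstanding;  /* slabs not yet handed back */
};

struct slab_model {
    struct rcu_head_model rcu;  /* first member, so the cast below is valid */
    struct cache_model *cache;
};

static void slab_free_cb(struct rcu_head_model *head)
{
    struct slab_model *slab = (struct slab_model *)head;

    slab->cache->slabs_outstanding--;
    free(slab);
}

static struct slab_model *slab_alloc(struct cache_model *cache)
{
    struct slab_model *slab = malloc(sizeof(*slab));

    assert(slab != NULL);
    slab->cache = cache;
    cache->slabs_outstanding++;
    return slab;
}

/* Releasing a slab is deferred when the cache is marked destroy_by_rcu. */
static void slab_release(struct slab_model *slab)
{
    if (slab->cache->destroy_by_rcu)
        call_rcu_model(&slab->rcu, slab_free_cb);
    else
        slab_free_cb(&slab->rcu);
}

/*
 * Models the destroy path. Returns 0 if the cache could be torn down,
 * -1 if deferred frees were still outstanding at that point.
 */
static int cache_destroy_model(struct cache_model *cache, int use_barrier)
{
    if (cache->destroy_by_rcu && use_barrier)
        rcu_barrier_model();
    return cache->slabs_outstanding == 0 ? 0 : -1;
}
```

In this model, releasing a slab from a destroy_by_rcu cache and then calling
cache_destroy_model(&cache, 0) reports outstanding slabs, while passing
use_barrier = 1 drains the queue first and the destroy succeeds. Putting the
barrier inside the destroy path means callers cannot forget it.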

Thanx, Paul