Re: [bug] __nf_ct_refresh_acct(): WARNING: at lib/list_debug.c:30__list_add+0x7d/0xad()

From: Patrick McHardy
Date: Wed Jun 17 2009 - 10:23:38 EST


Patrick McHardy wrote:
Eric Dumazet wrote:
Patrick McHardy a écrit :
No, before it is confirmed, its only visible to the CPU handling
the initial packet of a connection. Confirmation is the step that
makes it visible to other CPUs.

Thanks Patrick, I missed this, and your patch seems fine now :)

Thanks for your help, I'll send it to Dave later today.

I'm having some trouble figuring out the exact events that would
lead to the timer base corruption. Ingo, could you please test
this patch to make sure it also fixes the problem?


diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 5f72b94..9b20e58 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -425,7 +425,6 @@ __nf_conntrack_confirm(struct sk_buff *skb)
/* Remove from unconfirmed list */
hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);

- __nf_conntrack_hash_insert(ct, hash, repl_hash);
/* Timer relative to confirmation time, not original
setting time, otherwise we'd get timer wrap in
weird delay cases. */
@@ -433,8 +432,16 @@ __nf_conntrack_confirm(struct sk_buff *skb)
add_timer(&ct->timeout);
atomic_inc(&ct->ct_general.use);
set_bit(IPS_CONFIRMED_BIT, &ct->status);
+
+ /* Since the lookup is lockless, hash insertion must be done after
+ * starting the timer and setting the CONFIRMED bit. The RCU barriers
+ * guarantee that no other CPU can find the conntrack before the above
+ * stores are visible.
+ */
+ __nf_conntrack_hash_insert(ct, hash, repl_hash);
NF_CT_STAT_INC(net, insert);
spin_unlock_bh(&nf_conntrack_lock);
+
help = nfct_help(ct);
if (help && help->helper)
nf_conntrack_event_cache(IPCT_HELPER, ct);