Re: dynticks + iptables almost stops the boot process [was: Re: 2.6.20-rc6-mm3]

From: Ingo Molnar
Date: Tue Feb 06 2007 - 11:48:59 EST



Mattia,

* Mattia Dongili <malattia@xxxxxxxx> wrote:

> > I have it halfways reproducible now and I'm working to find the root
> > cause. Thanks for providing the info.
>
> Great, I'm obviously available to test any patch :)

Could you try the patch below? The RCU serialization code (rarely
called in general, but common in some types of setups) has a nasty
implicit dependency on the HZ tick. Until now that was a hidden wart,
but it became an explicit bug under dynticks. Maybe this is what is
slowing down your box.

Ingo

------------------------->
Subject: [patch] dynticks: make sure synchronize_rcu() completes
From: Ingo Molnar <mingo@xxxxxxx>

synchronize_rcu() has a nasty implicit dependency on the HZ tick: it
relies on another CPU finishing all RCU work so that this CPU can finish
its RCU work too - in IRQ context. But wait_for_completion() goes to
sleep indefinitely on dynticks and there might be no other IRQs to this
CPU for a long time.

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
 kernel/rcupdate.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Index: linux/kernel/rcupdate.c
===================================================================
--- linux.orig/kernel/rcupdate.c
+++ linux/kernel/rcupdate.c
@@ -85,8 +85,13 @@ void synchronize_rcu(void)
 	/* Will wake me after RCU finished */
 	call_rcu(&rcu.head, wakeme_after_rcu);
 
-	/* Wait for it */
-	wait_for_completion(&rcu.completion);
+	/*
+	 * Wait for it. Note: on dynticks RCU completion needs to be
+	 * polled frequently, to make sure we finish work. If this CPU
+	 * goes idle then another CPU cannot finish this CPU's work.
+	 */
+	while (wait_for_completion_timeout(&rcu.completion, HZ/100 ? : 1) == 0)
+		/* nothing */;
 }
 
 static void rcu_barrier_callback(struct rcu_head *notused)
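
A note on the timeout expression in the patch, for illustration only and
not part of the patch itself: HZ/100 ? : 1 uses the GNU C conditional
with an omitted middle operand, so it yields HZ/100 when that is
non-zero and 1 otherwise. The loop therefore re-polls the completion
roughly every 10 ms, and never passes a zero timeout when HZ is below
100. A minimal userspace sketch of the same arithmetic, assuming a
hypothetical helper name rcu_poll_timeout_jiffies() and treating HZ as
an ordinary parameter instead of the kernel's compile-time constant:

```c
/*
 * Hypothetical helper, not from the patch or the kernel tree.
 * Mirrors the expression "HZ/100 ? : 1" using the standard C form
 * of the conditional operator, with hz passed in as a parameter.
 */
static unsigned long rcu_poll_timeout_jiffies(unsigned long hz)
{
	unsigned long t = hz / 100;	/* about 10 ms worth of ticks */

	return t ? t : 1;		/* fall back to 1 tick if hz < 100 */
}
```

With this helper, hz values of 1000, 250, 100 and 50 give timeouts of
10, 2, 1 and 1 jiffies respectively.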