Re: [Bug #12650] Strange load average and ksoftirqd behavior with2.6.29-rc2-git1
From: Ingo Molnar
Date: Mon Feb 16 2009 - 04:51:22 EST
* Damien Wyart <damien.wyart@xxxxxxx> wrote:
> Hello,
>
> Ok, I've redone the tests under tip from this morning (Paris time).
> Everything is in http://damien.wyart.free.fr/ksoftirqd_pb/
>
> * Ingo Molnar <mingo@xxxxxxx> [2009-02-15 20:31]:
> > Yes, an abstime trace would be useful.
>
> The corresponding file is trace_tip_2009.02.16_ksoftirqd_pb_abstime.txt.gz
> and there is also a trace without abstime:
> trace_tip_2009.02.16_ksoftirqd_pb.txt.gz
hm, we need a trace with both abstime and process information included:
echo funcgraph-proc > trace_options
echo funcgraph-abstime > trace_options
Also, at 140 msecs the duration is a bit short - could you please make a
1-2 seconds capture? You can do that by increasing the number in
buffer_size_kb 10-fold:
echo 14100 > buffer_size_kb
(your defaults might be different)
You can see the duration of the trace by looking at the first timestamp
and the last one:
310.846260 | 0) 2.380 us | }
[...]
457.712729 | 1) | dequeue_task() {
Hm ... even with this limited trace, there's an unusually high amount of
RCU activities. Each activity goes like this:
457.680976 | 1) | do_softirq() {
457.680976 | 1) | __do_softirq() {
457.680977 | 1) | rcu_process_callbacks() {
457.680977 | 1) | __rcu_process_callbacks() {
4 457.680978 | 1) 0.478 us | force_quiescent_state();
457.680979 | 1) 1.591 us | }
457.680979 | 1) | __rcu_process_callbacks() {
457.680980 | 1) 0.478 us | force_quiescent_state();
457.680981 | 1) | cpu_quiet() {
457.680981 | 1) 0.506 us | _spin_lock_irqsave();
457.680982 | 1) 0.496 us | _spin_unlock_irqrestore();
457.680983 | 1) 2.545 us | }
457.680984 | 1) 4.626 us | }
457.680985 | 1) 7.823 us | }
457.680985 | 1) 0.496 us | _local_bh_enable();
457.680986 | 1) 9.845 us | }
457.680987 | 1) + 10.962 us | }
I've Cc:-ed Paul, as you have tree-RCU enabled, which is a new feature
in v2.6.29:
#
# RCU Subsystem
#
# CONFIG_CLASSIC_RCU is not set
CONFIG_TREE_RCU=y
# CONFIG_PREEMPT_RCU is not set
# CONFIG_RCU_TRACE is not set
CONFIG_RCU_FANOUT=32
# CONFIG_RCU_FANOUT_EXACT is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_PREEMPT_RCU_TRACE is not set
Damien, as an experiment, if you change your config to
CONFIG_CLASSIC_RCU=y, does the ksoftirqd problem go away?
On the other hand ... the softirq processing there looks anomalous in
itself, and might be due to some compiler bug perhaps. Could you try the
debug patch below please (you'll get it automatically if you update to
latest -tip), and redo the trace - the ftrace_printk() info should now
be embedded in the trace. The expected result would be for the printed
ot value to be non-zero at the #1 point, and zero at the #2 point.
Note: if it's a compiler optimization bug then this patch might make the
whole problem go away.
Ingo
------------------>