Re: [Bug #12650] Strange load average and ksoftirqd behavior with 2.6.29-rc2-git1

From: Ingo Molnar
Date: Mon Feb 16 2009 - 07:26:59 EST



* Damien Wyart <damien.wyart@xxxxxxx> wrote:

> * Ingo Molnar <mingo@xxxxxxx> [090216 10:50]:
> > hm, we need a trace with both abstime and process information included:
>
> > echo funcgraph-proc > trace_options
> > echo funcgraph-abstime > trace_options
>
> > Also, at 140 msecs the duration is a bit short - could you please make a
> > 1-2 seconds capture? You can do that by increasing the number in
> > buffer_size_kb 10-fold:
>
> > echo 14100 > buffer_size_kb
>
> Ok, I've redone a trace with these options enabled. The file is here:
> http://damien.wyart.free.fr/ksoftirqd_pb/trace_tip_2009.02.16_ksoftirqd_pb_abstime_proc.txt.gz

ok, here's the new annotated trace:

799.555279 | 1) ksoftir-2324 | | do_softirq() {
799.555279 | 1) ksoftir-2324 | | __do_softirq() {
799.555280 | 1) ksoftir-2324 | | /* #1 softirq pending: 00000100 */
799.555281 | 1) ksoftir-2324 | | /* #2 softirq pending: 00000000 */
799.555282 | 1) ksoftir-2324 | | rcu_process_callbacks() {
799.555282 | 1) ksoftir-2324 | | __rcu_process_callbacks() {
799.555283 | 1) ksoftir-2324 | 0.479 us | force_quiescent_state();
799.555284 | 1) ksoftir-2324 | 1.576 us | }
799.555284 | 1) ksoftir-2324 | | __rcu_process_callbacks() {
799.555285 | 1) ksoftir-2324 | | force_quiescent_state() {
799.555286 | 1) ksoftir-2324 | | cpu_quiet() {
799.555286 | 1) ksoftir-2324 | 0.518 us | _spin_lock_irqsave();
799.555287 | 1) ksoftir-2324 | 0.506 us | _spin_unlock_irqrestore();
799.555288 | 1) ksoftir-2324 | 2.563 us | }
799.555289 | 1) ksoftir-2324 | 4.624 us | }
799.555289 | 1) ksoftir-2324 | 7.836 us | }
799.555290 | 1) ksoftir-2324 | 0.495 us | _local_bh_enable();
799.555291 | 1) ksoftir-2324 | + 11.550 us | }
799.555291 | 1) ksoftir-2324 | + 12.713 us | }
799.555292 | 1) ksoftir-2324 | 0.524 us | _cond_resched();

We do get 0x100 which is 1 << RCU_SOFTIRQ, i.e. the RCU softirq. Paul,
this indeed seems to be a CONFIG_TREE_RCU=y bug.
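
For anyone decoding the "softirq pending" values by hand, here is a minimal sketch of the mapping. The softirq ordering in it is an assumption taken from the enum in include/linux/interrupt.h as of roughly 2.6.29 (other kernel versions number these differently), and decode_pending() is a hypothetical helper, not something that exists in the kernel or in ftrace:

```python
# Softirq bit positions, ASSUMED from the include/linux/interrupt.h enum of
# the 2.6.29 era. Other kernel versions insert or reorder entries, so check
# the tree you are actually tracing.
SOFTIRQ_NAMES = [
    "HI", "TIMER", "NET_TX", "NET_RX", "BLOCK",
    "TASKLET", "SCHED", "HRTIMER", "RCU",
]


def decode_pending(mask_hex):
    """Return the names of the softirqs whose bits are set in a hex mask string."""
    mask = int(mask_hex, 16)
    names = []
    bit = 0
    while mask >> bit:
        if mask & (1 << bit):
            # Bits beyond the assumed table are reported by position only.
            if bit < len(SOFTIRQ_NAMES):
                names.append(SOFTIRQ_NAMES[bit])
            else:
                names.append("bit%d" % bit)
        bit += 1
    return names


# The two values from the trace comments above.
print(decode_pending("00000100"))
print(decode_pending("00000000"))
```

With that table, the first value has only bit 8 set, which matches the 1 << RCU_SOFTIRQ reading, and the second has nothing pending.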

What is weird is that RCU_SOFTIRQ gets set again and again - but there are
no raise_softirq() calls. Could you please do a two-CPU trace too via:

echo 3 > /debug/tracing/tracing_cpumask

So that we can see what's happening on the other CPU?
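
The value written to tracing_cpumask is a hexadecimal bitmask with one bit per CPU, so 3 selects CPU 0 and CPU 1. As a rough sketch of how such a mask is put together for other CPU sets (cpumask() here is a hypothetical helper name, and the plain hex string it returns assumes a machine with few enough CPUs that no comma-separated groups are needed):

```python
def cpumask(cpus):
    """Build the hex mask string for an iterable of CPU numbers."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu  # one bit per CPU, CPU 0 is the lowest bit
    return format(mask, "x")


# CPUs 0 and 1, as in the echo command above.
print(cpumask([0, 1]))
```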

Also, could you please apply the debug patch below (or update to the
very latest -tip tree), so that we get trace entries of softirq triggers
too?

Thanks,

Ingo

-------------->