Re: Soft lockup issue in Linux 4.1.9

From: Holger Hoffstätte
Date: Thu Oct 08 2015 - 15:27:58 EST


On 10/08/15 18:56, Christoph Biedl wrote:
> Eric Dumazet wrote...
>
> [ commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af ]
>
>> It definitely should help !
>
> Yesterday I experienced issues somewhat similar to this, but I'm
> not entirely sure:
>
> Four of five systems running 4.1.9 stopped working. No reaction to
> network, keyboard, or serial console. In one case, the stack trace
> below made it to the loghost.
>
> Two things are quite different. First, the systems had a reasonable
> uptime, about a week.
>
> And second, the scary part: all incidents happened within a rather
> short time span of at most three minutes, beginning after 16:41:28 and
> before 16:41:54 UTC. So at first I assumed a brownout - until I realized
> the systems faded away at slightly different times, and one is at a
> different location, while other systems running different kernel
> versions continued to operate at both sites.
>
> So, I'd be glad for answers to the following:
>
> - Is this the same issue or should I be even more afraid?

There's always room for more. :-)

> - What might be the reason for this temporal coincidence? I have no
> plausible idea.

More bugs?

> Confused,
> Christoph
>
>
> INFO: rcu_sched self-detected stall on CPU { 3} (t=6000 jiffies g=8932806 c=8932805 q=58491)
> rcu_sched kthread starved for 5999 jiffies!
> Task dump for CPU 3:
> swapper/3 R running task 0 0 1 0x00000008
> ffffffff81e396c0 ffff88042dcc3b20 ffffffff810807da 0000000000000003
> ffffffff81e396c0 ffff88042dcc3b40 ffffffff81083b78 ffff88042dcc3b80
> 0000000000000003 ffff88042dcc3b70 ffffffff810a945c ffff88042dcd5740
> Call Trace:
> <IRQ> [<ffffffff810807da>] sched_show_task+0xaa/0x110
> [<ffffffff81083b78>] dump_cpu_task+0x38/0x40
> [<ffffffff810a945c>] rcu_dump_cpu_stacks+0x8c/0xc0
> [<ffffffff810abf31>] rcu_check_callbacks+0x3b1/0x680
> [<ffffffff810e7bb7>] ? acct_account_cputime+0x17/0x20
> [<ffffffff8108484e>] ? account_system_time+0x8e/0x180
> [<ffffffff810ae4d3>] update_process_times+0x33/0x60
> [<ffffffff810bcae0>] tick_sched_handle.isra.14+0x30/0x40
> [<ffffffff810bcbd3>] tick_sched_timer+0x43/0x80
> [<ffffffff810aea2a>] __run_hrtimer.isra.32+0x4a/0xd0
> [<ffffffff810af225>] hrtimer_interrupt+0xd5/0x1f0
> [<ffffffff81034d84>] local_apic_timer_interrupt+0x34/0x60
> [<ffffffff8103516c>] smp_apic_timer_interrupt+0x3c/0x60
> [<ffffffff8190db7b>] apic_timer_interrupt+0x6b/0x70
> [<ffffffff8190c8a9>] ? _raw_spin_unlock_irqrestore+0x9/0x10
> [<ffffffff810ade58>] try_to_del_timer_sync+0x48/0x60
> [<ffffffff810adeb2>] ? del_timer_sync+0x42/0x60
> [<ffffffff810adeba>] del_timer_sync+0x4a/0x60
> [<ffffffff8178b7da>] inet_csk_reqsk_queue_drop+0x7a/0x1f0
> [<ffffffff8178ba7f>] reqsk_timer_handler+0x12f/0x290
> [<ffffffff8178b950>] ? inet_csk_reqsk_queue_drop+0x1f0/0x1f0
> [<ffffffff810ad9e6>] call_timer_fn.isra.26+0x26/0x80
> [<ffffffff810ae1ae>] run_timer_softirq+0x18e/0x220
> [<ffffffff81060b1a>] __do_softirq+0xda/0x1f0
> [<ffffffff81060e16>] irq_exit+0x76/0xa0
> [<ffffffff81035175>] smp_apic_timer_interrupt+0x45/0x60
> [<ffffffff8190db7b>] apic_timer_interrupt+0x6b/0x70
> <EOI> [<ffffffff810844be>] ? sched_clock_cpu+0x9e/0xb0
> [<ffffffff8100bc15>] ? amd_e400_idle+0x35/0xd0
> [<ffffffff8100bc13>] ? amd_e400_idle+0x33/0xd0
> [<ffffffff8100c42a>] arch_cpu_idle+0xa/0x10
> [<ffffffff810929e3>] cpu_startup_entry+0x2c3/0x330
> [<ffffffff810332dc>] start_secondary+0x17c/0x1a0
>

The timer fixes were follow-ups to a patch that went into 4.1 called
"tcp/dccp: get rid of central timewait timer", and it seems there were
a few more patches in that area very recently.
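For the record, the trace above is the classic self-deadlock shape:
reqsk_timer_handler() ends up in del_timer_sync(), which spins until no
CPU is still running the handler - but the caller *is* the handler, so
the wait can never finish. A toy model of that pattern (plain Python,
purely illustrative; the names mirror the kernel's, and the `budget`
knob is my invention so the demo terminates instead of locking up):

```python
import threading

class Timer:
    """Toy stand-in for a kernel timer with a callback."""
    def __init__(self, fn):
        self.fn = fn
        self.running = False   # "a CPU is executing the handler"

    def fire(self):
        self.running = True
        try:
            self.fn(self)      # run the handler, like call_timer_fn()
        finally:
            self.running = False

def del_timer_sync(timer, budget=1000):
    # Spin until the handler has finished, like the kernel's
    # del_timer_sync(). `budget` only exists so this demo returns
    # instead of spinning forever.
    spins = 0
    while timer.running:
        spins += 1
        if spins >= budget:
            return False       # would spin forever: self-deadlock
    return True

deadlocked = []

def handler(timer):
    # The handler synchronously cancels its own timer - the bug pattern.
    deadlocked.append(not del_timer_sync(timer))

t = Timer(handler)
t.fire()
print(deadlocked[0])  # True: the wait can never complete
```

In the real kernel the spin happens with the timer base involved and no
budget, so the CPU just sits there until the RCU stall detector complains,
which is exactly what the log shows.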

So after some git spelunking I am now running with the following patches
on top of 4.1.10 + 83fccfc3940.. (for the lockups), in the following
order:

fc01538f9fb75572c969ca9988176ffc2a8741d6 simplify timewait refcounting
dbe7faa4045ea83a37b691b12bb02a8f86c2d2e9 inet_twsk_deschedule factorization
29c6852602e259d2c1882f320b29d5c3fec0de04 fix races in reqsk_queue_hash_req()
ed2e923945892a8372ab70d2f61d364b0b6d9054 fix timewait races in timer handling

They may not all be required for the particular problem you just summoned,
but (from what I could tell) are required to apply everything properly.
They certainly can't make things worse. :-)
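Something like this should do the trick (untested sketch; assumes your
tree sits at v4.1.10 with 83fccfc3940c already applied, and fetches
mainline so the commit objects are available locally):

```shell
# In a kernel tree checked out at v4.1.10 (with 83fccfc3940c on top).
# Fetch Linus' tree so the mainline commits below are reachable.
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

# Apply in the order listed above; -x records the upstream commit id
# in each cherry-picked commit message.
git cherry-pick -x fc01538f9fb75572c969ca9988176ffc2a8741d6  # simplify timewait refcounting
git cherry-pick -x dbe7faa4045ea83a37b691b12bb02a8f86c2d2e9  # inet_twsk_deschedule factorization
git cherry-pick -x 29c6852602e259d2c1882f320b29d5c3fec0de04  # fix races in reqsk_queue_hash_req()
git cherry-pick -x ed2e923945892a8372ab70d2f61d364b0b6d9054  # fix timewait races in timer handling
```

If a pick conflicts, you're probably missing one of the earlier commits
in the series.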

Oh and while you're at it you can apply these l33t cubic fixes :-)
30927520dbae297182990bb21d08762bcc35ce1d better follow cubic curve after idle period
c2e7204d180f8efc80f27959ca9cf16fa17f67db do not set epoch_start in the future

I've been running these on 3 machines for almost 10 minutes without issue,
so they are totally safe to go into production right away.

-h
