Stall in serial8250_console_putchar hangs the system

From: Nikolay Borisov
Date: Mon Sep 28 2015 - 09:24:36 EST


Hello,

I'm running the stable 3.12.47 kernel and today one of the server
started reporting softlockups and rcushed stalls. I believe the 2
things are related with the culprit being the following:

INFO: rcu_sched detected stalls on CPUs/tasks: { 9} (detected by 6, t=10020747 jiffies, g=5324883, c=5324882, q=284881394)

The backtrace for CPU#9 is this:

NMI backtrace for cpu 9
CPU: 9 PID: 8315 Comm: zip Not tainted 3.12.47-clouder1 #1
Hardware name: Supermicro X10DRi/X10DRi, BIOS 1.1 04/14/2015
task: ffff883f5762a040 ti: ffff883f45584000 task.ti: ffff883f45584000
RIP: 0010:[<ffffffff813ccfe5>] [<ffffffff813ccfe5>] io_serial_in+0x15/0x20
RSP: 0018:ffff883f45585968 EFLAGS: 00000002
RAX: ffff883f45585f00 RBX: ffffffff820448c0 RCX: 0000000000000000
RDX: 00000000000003fd RSI: 00000000000003fd RDI: ffffffff820448c0
RBP: ffff883f45585968 R08: 00000000ab46d39e R09: 0000000000000009
R10: 0000000000000001 R11: 0000000000000000 R12: 000000000000260e
R13: 0000000000000020 R14: 00000000000000bc R15: ffffffff813cd6a0
FS: 00002b33dc0f9b20(0000) GS:ffff881fffb20000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000002a00020 CR3: 0000003f84481000 CR4: 00000000001407e0
Stack:
ffff883f45585998 ffffffff813cd63a 0000000000000000 ffffffff820448c0
0000000000000020 00000000000000c9 ffff883f455859b8 ffffffff813cd6c6
ffffffff81e070bb ffffffff820448c0 ffff883f455859f8 ffffffff813c8d7d
Call Trace:
[<ffffffff813cd63a>] wait_for_xmitr+0x3a/0xa0
[<ffffffff813cd6c6>] serial8250_console_putchar+0x26/0x40
[<ffffffff813c8d7d>] uart_console_write+0x3d/0x70
[<ffffffff813cf5d6>] serial8250_console_write+0xb6/0x180
[<ffffffff810d0105>] call_console_drivers.clone.3+0xa5/0x110
[<ffffffff810d0408>] console_unlock+0x298/0x420
[<ffffffff810d0ab5>] vprintk_emit+0x2b5/0x4a0
[<ffffffff816515a4>] printk+0x4d/0x4f
[<ffffffff8126f2c2>] __ext4_error_inode+0x132/0x140
[<ffffffff811d7990>] ? submit_bh+0x10/0x20
[<ffffffff811d86ce>] ? __wait_on_buffer+0x2e/0x30
[<ffffffff8124dd76>] __ext4_check_dir_entry+0x146/0x150
[<ffffffff81263510>] htree_dirblock_to_tree+0xe0/0x190
[<ffffffff811b1ecd>] ? link_path_walk+0x29d/0x8f0
[<ffffffff812636da>] ext4_htree_fill_tree+0x11a/0x2d0
[<ffffffff811c4df1>] ? mntput+0x21/0x30
[<ffffffff811b562f>] ? path_openat+0x15f/0x610
[<ffffffff8118ad04>] ? kmem_cache_alloc_trace+0xe4/0x1e0
[<ffffffff8124decc>] ext4_readdir+0x14c/0x830
[<ffffffff811b7e25>] iterate_dir+0xb5/0xe0
[<ffffffff811afc36>] ? final_putname+0x26/0x50
[<ffffffff811b7fe9>] SyS_getdents+0x99/0x110
[<ffffffff811b7b60>] ? filldir64+0x120/0x120
[<ffffffff81657272>] system_call_fastpath+0x16/0x1b


Basically what has happened is that a command has tried to read a
directory but has encountered some error (in this case the command is
zip, but this is inconsequential). What happens then is that
__ext4_check_dir_entry proceeds to print an error and since the kernel
has been booted with the following parameters:

ro root=/dev/sda2 console=tty1 console=ttyS0,9600,n8 crashkernel=256M panic=5 ipv6.disable=1 elevator=noop pcie_aspm=off biosdevname=0 intel_idle.max_cstate=0 i915.i915_enable_rc6=0 consoleblank=0

so ttyS0 is actually the serial console that the BMC exposes. Reading
the code in wait_for_xmitr I can see that there is a timeout of 10ms
for transmitting the character and 1 second for possible flow control
messages and so in the worst possible case this function should timeout
and not cause rcu stalls. I've also comapred the code for the 4.2 kernel
and it seems this portion hasn't changed at all.


On the other hand the softlockup at the same time looks like the
following:

CPU: 2 PID: 21106 Comm: iptables Not tainted 3.12.47-clouder1 #1
Hardware name: Supermicro X10DRi/X10DRi, BIOS 1.1 04/14/2015
task: ffff883cea0a8810 ti: ffff883f57542000 task.ti: ffff883f57542000
RIP: 0010:[<ffffffff810e7f81>] [<ffffffff810e7f81>] smp_call_function_many+0x241/0x290
RSP: 0018:ffff883f575439f8 EFLAGS: 00000202
RAX: 0000000000000009 RBX: ffffffff81081aa0 RCX: 0000000000000009
RDX: ffff881fffb35478 RSI: 0000000000000040 RDI: 0000000000000040
RBP: ffff883f57543a58 R08: ffff881fffa51ec8 R09: 0000000000000000
R10: 0000000000000000 R11: ffffffff81175782 R12: 0000000000000000
R13: ffff881fffa51ec0 R14: ffff881fffa51ec8 R15: 0000000000011f00
FS: 0000000000000000(0000) GS:ffff881fffa40000(0063) knlGS:0000000055557fd0
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 00000000096a65cc CR3: 0000001bb4a82000 CR4: 00000000001407e0
Stack:
ffff883fff4f5478 ffff883fff4f1f00 ffff883fff4f1f10 000000000000002f
01ff883f57543b98 0000000000000002 ffff88407fff8f00 ffff883f57542000
0000000000000000 ffff883f57543b60 ffff883f57543b48 000000000000c005
Call Trace:
[<ffffffff810e819a>] smp_call_function+0x3a/0x50
[<ffffffff81081aa0>] ? leave_mm+0x70/0x70
[<ffffffff810e821a>] on_each_cpu+0x3a/0xa0
[<ffffffff8108192f>] flush_tlb_kernel_range+0x4f/0x80
[<ffffffffa0299ccf>] ? nf_ct_l3proto_try_module_get+0x1f/0x60 [nf_conntrack]
[<ffffffff811740b9>] __purge_vmap_area_lazy+0x129/0x4d0
[<ffffffff811889ba>] ? __slab_free+0x2ea/0x3c0
[<ffffffff811889ba>] ? __slab_free+0x2ea/0x3c0
[<ffffffff811422e3>] ? free_pcppages_bulk+0x1b3/0x440
[<ffffffff811744c0>] free_vmap_area_noflush+0x60/0x70
[<ffffffff81175525>] free_unmap_vmap_area+0x25/0x30
[<ffffffff8117559d>] remove_vm_area+0x6d/0x80
[<ffffffff811756ee>] __vunmap+0x2e/0x110
[<ffffffff8117586b>] vfree+0x3b/0x80
[<ffffffff815c9b1d>] xt_free_table_info+0x6d/0x140
[<ffffffff81623451>] compat_do_replace+0x921/0xa10
[<ffffffff8162380f>] compat_do_ipt_set_ctl+0x7f/0xa0
[<ffffffff815c6747>] compat_nf_sockopt+0x67/0xe0
[<ffffffff815c6801>] compat_nf_setsockopt+0x21/0x30
[<ffffffff815d9a5a>] compat_ip_setsockopt+0xaa/0xc0
[<ffffffff815face6>] compat_raw_setsockopt+0x16/0x30
[<ffffffff8157e10c>] compat_sock_common_setsockopt+0x1c/0x30
[<ffffffff815b4269>] compat_sys_setsockopt+0x79/0x260
[<ffffffff815b4545>] compat_sys_socketcall+0xf5/0x210
[<ffffffff81658ab3>] sysenter_dispatch+0x7/0x23

My guess is that the this occurs since CPU 9 is blocked with
irqs disabled from serial8250_console_write, which causes the
IPI necessary for the tlb invalidation to never be received.

So I guess the pertinent question is how come wait_for_xmitr
block and never return since there are timeouts in the logic
of the function? And also is it normal that such a failure
mode could cause the whole machine to be unstable?

Regards,
Nikolay

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/