Re: [Xen-devel] 4.4: INFO: rcu_sched self-detected stall on CPU

From: Boris Ostrovsky
Date: Fri Mar 25 2016 - 10:45:25 EST

On 03/25/2016 10:05 AM, Steven Haigh wrote:
On 25/03/2016 11:23 PM, Boris Ostrovsky wrote:
On 03/24/2016 10:53 PM, Steven Haigh wrote:
Hi all,

Firstly, I've cross-posted this to xen-devel and the lkml - as this
problem seems to only exist when using kernel 4.4 as a Xen DomU kernel.
I have also CC'ed Greg KH for his awesome insight as maintainer.

Please CC me on replies - as I'm not a member of the kernel
mailing list, I may miss replies if I'm only monitoring the archives.

I've noticed recently that heavy disk IO is causing rcu_sched to detect
stalls. The process mentioned usually goes to 100% CPU usage, and
eventually processes start segfaulting and dying. The only fix to
recover the system is to use 'xl destroy' to force-kill the VM and to
start it again.

The majority of these issues seem to mention ext4 in the trace. This may
indicate an issue there - or may be a red herring.

The gritty details:
INFO: rcu_sched self-detected stall on CPU
0-...: (20999 ticks this GP) idle=327/140000000000001/0
softirq=1101493/1101493 fqs=6973
(t=21000 jiffies g=827095 c=827094 q=524)
Task dump for CPU 0:
rsync R running task 0 2446 2444 0x00000088
ffffffff818d0c00 ffff88007fc03c58 ffffffff810a625f 0000000000000000
ffffffff818d0c00 ffff88007fc03c70 ffffffff810a8699 0000000000000001
ffff88007fc03ca0 ffffffff810d0e5a ffff88007fc170c0 ffffffff818d0c00
Call Trace:
<IRQ> [<ffffffff810a625f>] sched_show_task+0xaf/0x110
[<ffffffff810a8699>] dump_cpu_task+0x39/0x40
[<ffffffff810d0e5a>] rcu_dump_cpu_stacks+0x8a/0xc0
[<ffffffff810d4884>] rcu_check_callbacks+0x424/0x7a0
[<ffffffff810a91e1>] ? account_system_time+0x81/0x110
[<ffffffff810a9481>] ? account_process_tick+0x61/0x160
[<ffffffff810e8050>] ? tick_sched_do_timer+0x30/0x30
[<ffffffff810d9749>] update_process_times+0x39/0x60
[<ffffffff810e7aa6>] tick_sched_handle.isra.15+0x36/0x50
[<ffffffff810e808d>] tick_sched_timer+0x3d/0x70
[<ffffffff810da342>] __hrtimer_run_queues+0xf2/0x250
[<ffffffff810da698>] hrtimer_interrupt+0xa8/0x190
[<ffffffff8100c61e>] xen_timer_interrupt+0x2e/0x140
[<ffffffff810c8555>] handle_irq_event_percpu+0x55/0x1e0
[<ffffffff810cbbca>] handle_percpu_irq+0x3a/0x50
[<ffffffff810c7d22>] generic_handle_irq+0x22/0x30
[<ffffffff813e50ff>] __evtchn_fifo_handle_events+0x15f/0x180
[<ffffffff813e5130>] evtchn_fifo_handle_events+0x10/0x20
[<ffffffff813e2233>] __xen_evtchn_do_upcall+0x43/0x80
[<ffffffff813e3ea0>] xen_evtchn_do_upcall+0x30/0x50
[<ffffffff8165deb2>] xen_hvm_callback_vector+0x82/0x90
<EOI> [<ffffffff810baf0d>] ? queued_write_lock_slowpath+0x3d/0x80
[<ffffffff8165bcce>] _raw_write_lock+0x1e/0x30
This looks to me like ext4 failing to grab a lock. Everything above it
(in Xen code) is regular tick interrupt handling which detects the stall.

Your config does not have CONFIG_PARAVIRT_SPINLOCKS so that eliminates
any possible issues with pv locks.
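(As a quick way to double-check that from inside the guest - the config paths below are the usual distro locations and may differ on your system:)

```shell
# Confirm the running DomU kernel was built without paravirt spinlocks.
# Checks the on-disk config first, then /proc/config.gz if available.
grep CONFIG_PARAVIRT_SPINLOCKS "/boot/config-$(uname -r)" 2>/dev/null
zcat /proc/config.gz 2>/dev/null | grep CONFIG_PARAVIRT_SPINLOCKS
```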

Do you see anything "interesting" in dom0? (e.g. dmesg, xl dmesg,
/var/log/xen/) Are you oversubscribing your guest (CPU-wise)?
There is nothing special being logged anywhere that I can see. dmesg /
xl dmesg on the Dom0 show nothing unusual.

I do share CPUs - but I don't give any DomU more than 2 vcpus. The
physical host has 4 cores - 1 pinned to the Dom0.
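(To see how the vCPUs actually land on physical cores, something like this from dom0 may help - assuming the xl toolstack, and that column 4 of `xl vcpu-list` is the pCPU as in its default output format:)

```shell
# From dom0: count how many vCPUs are currently placed on each physical CPU.
# Skips the header row and any vCPUs not currently assigned a pCPU.
xl vcpu-list | awk 'NR > 1 && $4 ~ /^[0-9]+$/ { n[$4]++ }
                    END { for (c in n) print "pCPU " c ": " n[c] " vCPUs" }'
```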

I log to a remote syslog on this system - and I've uploaded the entire
log to a pastebin (don't want to do a 45Kb attachment here):

That doesn't look like a full log. In any case, the RCU stall may be a secondary problem; there are a bunch of splats before the stall.


Not sure if it makes any difference at all, but my DomU config is:
# cat /etc/xen/backup.vm
name = "backup.vm"
memory = 2048
vcpus = 2
cpus = "1-3"
disk = [ 'phy:/dev/vg_raid1_new/backup.vm,xvda,w' ]
vif = [ "mac=00:11:36:35:35:09, bridge=br203, vifname=vm.backup, script=vif-bridge" ]
bootloader = 'pygrub'
pvh = 1

on_poweroff = 'destroy'
on_reboot = 'restart'
on_crash = 'restart'
cpu_weight = 64

I never had this problem when running kernel 4.1.x - it only started
when I upgraded everything to 4.4. Not exactly a great help on its
own, but it may help narrow things down?

[<ffffffff81289b23>] ext4_es_remove_extent+0x43/0xc0
[<ffffffff812692f9>] ext4_clear_inode+0x39/0x80
[<ffffffff8124cafd>] ext4_evict_inode+0x8d/0x4e0
[<ffffffff811df537>] evict+0xb7/0x180
[<ffffffff811df636>] dispose_list+0x36/0x50
[<ffffffff811e054b>] prune_icache_sb+0x4b/0x60
[<ffffffff811c92c1>] super_cache_scan+0x141/0x190
[<ffffffff811655fe>] shrink_slab.part.37+0x1ee/0x390
[<ffffffff8116950c>] shrink_zone+0x26c/0x280
[<ffffffff8116988c>] do_try_to_free_pages+0x15c/0x410
[<ffffffff81169bfa>] try_to_free_pages+0xba/0x170
[<ffffffff8115d1e5>] __alloc_pages_nodemask+0x525/0xa60
[<ffffffff811a839c>] ? kmem_cache_free+0xcc/0x2c0
[<ffffffff811a120d>] alloc_pages_current+0x8d/0x120
[<ffffffff81154601>] __page_cache_alloc+0x91/0xc0
[<ffffffff81154b66>] pagecache_get_page+0x56/0x1e0
[<ffffffff81154ee6>] grab_cache_page_write_begin+0x26/0x40
[<ffffffff8124c811>] ext4_da_write_begin+0xa1/0x300
[<ffffffff8124d3a4>] ? ext4_da_write_end+0x124/0x2b0
[<ffffffff811543a0>] generic_perform_write+0xc0/0x1a0
[<ffffffff81156408>] __generic_file_write_iter+0x188/0x1e0
[<ffffffff812421d0>] ext4_file_write_iter+0xf0/0x340
[<ffffffff811c59fa>] __vfs_write+0xaa/0xe0
[<ffffffff811c6022>] vfs_write+0xa2/0x1a0
[<ffffffff811c6d26>] SyS_write+0x46/0xa0
[<ffffffff8165c02e>] entry_SYSCALL_64_fastpath+0x12/0x71
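(The lower half of that trace shows the write path dropping into direct reclaim, with the superblock shrinker evicting ext4 inodes - which is where the contended write lock is taken. One way to see whether writers are routinely stalling in synchronous reclaim while the rsync runs - note the exact counter names vary between kernel versions:)

```shell
# Watch direct-reclaim activity during the heavy rsync. If the
# allocstall/pgscan_direct counters climb steadily, writers are doing
# synchronous reclaim, i.e. the shrink_slab path seen in the trace.
grep -E 'allocstall|pgscan_direct' /proc/vmstat
```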

Some 11 hours later:
sshd[785]: segfault at 1f0 ip 00007f03bb94ae5c sp 00007ffe9eb54470 error 4 in[7f03bb940000+21000]
sh[787]: segfault at 1f0 ip 00007f6b4a0dfe5c sp 00007ffe3d4a71e0 error 4
systemd-cgroups[788]: segfault at 1f0 ip 00007f4baa82ce5c sp 00007ffd28e4c4b0 error 4 in[7f4baa822000+21000]
sshd[791]: segfault at 1f0 ip 00007ff8c8a8ce5c sp 00007ffede9e1c20 error 4 in[7ff8c8a82000+21000]
sshd[792]: segfault at 1f0 ip 00007f183cf75e5c sp 00007ffc81ab7160 error 4 in[7f183cf6b000+21000]
sshd[793]: segfault at 1f0 ip 00007f3c665ece5c sp 00007ffd9a13c850 error 4 in[7f3c665e2000+21000]
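(Worth noting: every one of those segfaults is at the same fault address, 1f0, and at the same offset into a 0x21000-byte mapping - subtracting the mapping base from the ip gives an identical offset in each process, which points at one bad page in a shared object rather than unrelated bugs:)

```shell
# ip - mapping_base is identical across the crashed processes,
# using the addresses from the log lines above.
printf '0x%x\n' $(( 0x7f03bb94ae5c - 0x7f03bb940000 ))   # sshd[785]
printf '0x%x\n' $(( 0x7ff8c8a8ce5c - 0x7ff8c8a82000 ))   # sshd[791]
# both print 0xae5c
```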

From others' isolated testing, this reportedly does not occur on
kernel 4.5.x - however, I have not verified this myself.
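(Given a known-good 4.1 and a known-bad 4.4, a bisect between the two release tags would pinpoint the offending commit. A rough sketch, assuming a mainline kernel checkout and that the heavy rsync workload reproduces the stall reliably on each boot:)

```shell
# Bisect between the known-good and known-bad releases; at each step,
# build the kernel, boot it as the DomU, run the rsync workload,
# then mark the result.
git bisect start
git bisect bad v4.4
git bisect good v4.1
# ...after testing each build:
git bisect good    # or: git bisect bad
# repeat until git names the first bad commit, then:
git bisect reset
```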

The kernel config used can be found in the kernel-xen git repo if it
assists in debugging.
