Re: [rcu_sched stall] regression/miss-config ?
From: Paul E. McKenney
Date: Fri May 20 2016 - 00:05:30 EST
On Thu, May 19, 2016 at 04:45:26PM -0700, Santosh Shilimkar wrote:
> Hi Paul,
>
> On 5/17/2016 12:15 PM, Paul E. McKenney wrote:
> >On Tue, May 17, 2016 at 06:46:22AM -0700, santosh.shilimkar@xxxxxxxxxx wrote:
> >>On 5/16/16 5:58 PM, Paul E. McKenney wrote:
> >>>On Mon, May 16, 2016 at 12:49:41PM -0700, Santosh Shilimkar wrote:
> >>>>On 5/16/2016 10:34 AM, Paul E. McKenney wrote:
> >>>>>On Mon, May 16, 2016 at 09:33:57AM -0700, Santosh Shilimkar wrote:
> >>
> >>[...]
> >>
> >>>>>Are you running CONFIG_NO_HZ_FULL=y? If so, the problem might be that
> >>>>>you need more housekeeping CPUs than you currently have configured.
> >>>>>
> >>>>Yes, CONFIG_NO_HZ_FULL=y. Do you mean "CONFIG_NO_HZ_FULL_ALL=y" for
> >>>>housekeeping? It seems that without it, the clock-event code will just
> >>>>use CPU0 for things like broadcasting, which might become a bottleneck.
> >>>>That could explain the hrtimer_interrupt() path getting slowed down by
> >>>>a housekeeping bottleneck.
> >>>>
> >>>>$cat .config | grep NO_HZ
> >>>>CONFIG_NO_HZ_COMMON=y
> >>>># CONFIG_NO_HZ_IDLE is not set
> >>>>CONFIG_NO_HZ_FULL=y
> >>>># CONFIG_NO_HZ_FULL_ALL is not set
> >>>># CONFIG_NO_HZ_FULL_SYSIDLE is not set
> >>>>CONFIG_NO_HZ=y
> >>>># CONFIG_RCU_FAST_NO_HZ is not set
> >>>
> >>>Yes, CONFIG_NO_HZ_FULL_ALL=y would give you only one CPU for all
> >>>housekeeping tasks, including the RCU grace-period kthreads. So you are
> >>>booting without any nohz_full boot parameter? You can end up with the
> >>>same problem with CONFIG_NO_HZ_FULL=y and the nohz_full boot parameter
> >>>as you can with CONFIG_NO_HZ_FULL_ALL=y.
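Whether a given kernel is actually running with adaptive-tick CPUs can be checked from the command line; a minimal sketch (the /proc/cmdline grep works on any Linux, while the /boot/config path is a common but distro-dependent convention):

```shell
# Check whether the kernel was booted with a nohz_full= parameter
# (no match means no CPUs are in adaptive-tick mode).
grep -o 'nohz_full=[^ ]*' /proc/cmdline || echo "no nohz_full boot parameter"

# The build-time setting, where the distro ships the kernel config:
grep NO_HZ_FULL /boot/config-"$(uname -r)" 2>/dev/null || true
```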
> >>>
> >>I see. Yes, the systems are booting without the nohz_full boot parameter.
> >>Will try adding more housekeeping CPUs and update the thread
> >>after verification, since it takes time to reproduce the issue.
> >>
> >>Thanks for the discussion so far, Paul. It has been very insightful for me.
> >
> >Please let me know how things go with further testing, especially with
> >the priority setting.
> >
> Sorry for the delay. I managed to get information about the XEN use-case
> custom config as discussed above. To reduce variables, I disabled
> "CONFIG_NO_HZ_FULL" altogether. So the effective settings were:
>
> CONFIG_NO_HZ_IDLE=y
> # CONFIG_NO_HZ_FULL is not set
> CONFIG_TREE_RCU_TRACE=y
> CONFIG_RCU_KTHREAD_PRIO=1
> CONFIG_RCU_CPU_STALL_TIMEOUT=21
> CONFIG_RCU_TRACE=y
>
> Unfortunately the XEN test still failed; the log is at the end of
> the email. This test is a bit peculiar, though, since
> it is a database running in a VM with 1 or 2 CPUs. One
> suspect is that because the database RT processes are
> hogging the CPU(s), the kernel RCU thread is not getting a chance
> to run, which eventually results in a stall. Does that
> make sense?
>
> Please note that it is a non-preempt kernel using RT processes. ;-)
If you have enough real-time processes to consume all CPU, you will
indeed starve the grace-period kthread, so what you see below would
then be expected behavior.
Try setting CONFIG_RCU_KTHREAD_PRIO to be larger than the real-time
priority of your processes.
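A similar effect can also be had at run time with chrt from util-linux; a minimal sketch, assuming the grace-period kthread is named rcu_sched (as on this kernel) and the database tasks run at SCHED_FIFO priority 1 (a hypothetical value; raising the kthread above it requires root):

```shell
chrt -m                            # list the valid priority range per policy
pid=$(pgrep -x rcu_sched || true)  # grace-period kthread, if visible here
if [ -n "$pid" ]; then
    chrt -p "$pid"                 # show its current policy and priority
    # To outrank an RT application at SCHED_FIFO priority 1 (needs root):
    # chrt --fifo -p 2 "$pid"
fi
```

On later kernels the boot parameter rcutree.kthread_prio= serves the same purpose as the CONFIG_RCU_KTHREAD_PRIO Kconfig option.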
Thanx, Paul
> # cat .config | grep PREEMPT
> CONFIG_PREEMPT_NOTIFIERS=y
> # CONFIG_PREEMPT_NONE is not set
> CONFIG_PREEMPT_VOLUNTARY=y
> # CONFIG_PREEMPT is not set
>
> Regards,
> Santosh
> ...
> ....
> rcu_sched kthread starved for 399032 jiffies!
> INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0,
> t=462037 jiffies, g=118888, c=118887, q=0)
> All QSes seen, last rcu_sched kthread activity 462037
> (4296277632-4295815595), jiffies_till_next_fqs=3, root ->qsmask 0x0
> ocssd.bin R running task 0 15375 1 0x00000000
> 0000000000000000 ffff8800ec003bc8 ffffffff810a8581 ffffffff81abf980
> 000000000001d068 ffff8800ec003c28 ffffffff810e9c98 0000000000000000
> 0000000000000086 0000000000000000 0000000000000086 0000000000000082
> Call Trace:
> <IRQ> [<ffffffff810a8581>] sched_show_task+0xb1/0x120
> [<ffffffff810e9c98>] print_other_cpu_stall+0x288/0x2d0
> [<ffffffff810e9e60>] __rcu_pending+0x180/0x230
> [<ffffffff810e9fa5>] rcu_check_callbacks+0x95/0x140
> [<ffffffff810edf92>] update_process_times+0x42/0x70
> [<ffffffff810ffc79>] tick_sched_handle+0x39/0x80
> [<ffffffff810ffee2>] tick_sched_timer+0x52/0xa0
> [<ffffffff810f0424>] __run_hrtimer+0x74/0x1d0
> [<ffffffff810ffe90>] ? tick_nohz_handler+0xc0/0xc0
> [<ffffffff810f07b2>] hrtimer_interrupt+0x102/0x240
> [<ffffffff8100aaae>] xen_timer_interrupt+0x2e/0x130
> [<ffffffff8143347a>] ? add_interrupt_randomness+0x3a/0x1f0
> [<ffffffff81384290>] ? store_cursor_blink+0xc0/0xc0
> [<ffffffff810d9de4>] handle_irq_event_percpu+0x54/0x1b0
> [<ffffffff810dd477>] handle_percpu_irq+0x47/0x70
> [<ffffffff810d9d27>] generic_handle_irq+0x27/0x40
> [<ffffffff813e53ca>] evtchn_2l_handle_events+0x25a/0x260
> [<ffffffff81085541>] ? __do_softirq+0x191/0x2f0
> [<ffffffff813e2c2f>] __xen_evtchn_do_upcall+0x4f/0x90
> [<ffffffff813e3404>] xen_evtchn_do_upcall+0x34/0x50
> [<ffffffff816c761e>] xen_hvm_callback_vector+0x6e/0x80
> <EOI>
> rcu_sched kthread starved for 462037 jiffies!
>
>