Re: [RFC PATCH] tick/sched: update full_nohz status after SCHED dep is cleared

From: Juri Lelli
Date: Tue Jun 16 2020 - 02:58:11 EST


On 15/06/20 23:07, Frederic Weisbecker wrote:
> On Thu, May 21, 2020 at 07:00:20PM +0200, Juri Lelli wrote:

[...]

> > sysjitter-2377 [004] 100.438495: sched_switch: sysjitter:2377 [120] R ==> ksoftirqd/4:31 [120]
> > ksoftirqd/4-31 [004] 100.438497: softirq_entry: vec=1 [action=TIMER]
> > ksoftirqd/4-31 [004] 100.438499: timer_cancel: timer=0xffffffffa55a9d20
> > ksoftirqd/4-31 [004] 100.438501: timer_expire_entry: timer=0xffffffffa55a9d20 function=clocksource_watchdog now=4294759824 baseclk=4294759824
> > ksoftirqd/4-31 [004] 100.438504: timer_start: timer=0xffffffffa55a9d20 function=clocksource_watchdog expires=4294760322 [timeout=498] cpu=5 idx=113 flags=D|P|I

[...]

> > ksoftirqd/4-31 [004] 100.438509: hrtimer_cancel: hrtimer=0xffff92e9df91fd20
> > ksoftirqd/4-31 [004] 100.438511: hrtimer_expire_entry: hrtimer=0xffff92e9df91fd20 now=92630515022 function=tick_sched_timer/0x0
> > ksoftirqd/4-31 [004] 100.438515: kernel_stack: <stack trace>
> > => __ftrace_trace_stack (ffffffffa37da921)
> > => __raise_softirq_irqoff (ffffffffa36daf50)
> > => raise_softirq (ffffffffa36db0fe)
> > => rcu_sched_clock_irq (ffffffffa375af4a)
> > => update_process_times (ffffffffa3768fa4)
> > => tick_sched_handle (ffffffffa377aaa2)
> > => tick_sched_timer (ffffffffa377ad53)
> > => __hrtimer_run_queues (ffffffffa3769cf0)
> > => hrtimer_interrupt (ffffffffa376a58a)
> > => smp_apic_timer_interrupt (ffffffffa40028f8)
> > => apic_timer_interrupt (ffffffffa4001b7f)
> > => filter_pred_32 (ffffffffa37f3357)
> > => filter_match_preds (ffffffffa37f3510)
> > => trace_event_buffer_commit (ffffffffa37dc7eb)
> > => trace_event_raw_event_softirq (ffffffffa36dab77)
> > => __do_softirq (ffffffffa420025a)
> > => run_ksoftirqd (ffffffffa36dadc6)
> > => smpboot_thread_fn (ffffffffa36ffdb8)
> > => kthread (ffffffffa36f9fb7)
> > => ret_from_fork (ffffffffa4000215)
> > ksoftirqd/4-31 [004] 100.438516: softirq_raise: vec=9 [action=RCU]
> > ksoftirqd/4-31 [004] 100.438520: kernel_stack: <stack trace>
> > => __ftrace_trace_stack (ffffffffa37da921)
> > => __raise_softirq_irqoff (ffffffffa36daf50)
> > => raise_softirq (ffffffffa36db0fe)
> > => trigger_load_balance (ffffffffa371cb9c)
> > => update_process_times (ffffffffa3768fc7)
> > => tick_sched_handle (ffffffffa377aaa2)
> > => tick_sched_timer (ffffffffa377ad53)
> > => __hrtimer_run_queues (ffffffffa3769cf0)
> > => hrtimer_interrupt (ffffffffa376a58a)
> > => smp_apic_timer_interrupt (ffffffffa40028f8)
> > => apic_timer_interrupt (ffffffffa4001b7f)
> > => filter_pred_32 (ffffffffa37f3357)
> > => filter_match_preds (ffffffffa37f3510)
> > => trace_event_buffer_commit (ffffffffa37dc7eb)
> > => trace_event_raw_event_softirq (ffffffffa36dab77)
> > => __do_softirq (ffffffffa420025a)
> > => run_ksoftirqd (ffffffffa36dadc6)
> > => smpboot_thread_fn (ffffffffa36ffdb8)
> > => kthread (ffffffffa36f9fb7)
> > => ret_from_fork (ffffffffa4000215)
> > ksoftirqd/4-31 [004] 100.438520: softirq_raise: vec=7 [action=SCHED]
> > ksoftirqd/4-31 [004] 100.438521: hrtimer_expire_exit: hrtimer=0xffff92e9df91fd20
> > ksoftirqd/4-31 [004] 100.438523: hrtimer_start: hrtimer=0xffff92e9df91fd20 function=tick_sched_timer/0x0 expires=92631512937 softexpires=92631512937
> > ksoftirqd/4-31 [004] 100.438525: local_timer_exit: vector=236
> > ksoftirqd/4-31 [004] 100.438527: tick_stop: success=0 dependency=SCHED

[...]

> > ksoftirqd/4-31 [004] 100.438530: softirq_exit: vec=1 [action=TIMER]
>
> Was there a corresponding softirq_enter? With timers expiring inside?

The one up above, at 100.438497.

[...]

> >
> > Does this make sense and help in any way? Suggestions for debugging this
> > further? :-)
>
> Let's see if I can reproduce it first. If not I'll need to bother you further :)

Sure. Let me know if you find anything.

Best,

Juri