Re: [Query] Preemption (hogging) of the work handler

From: Viresh Kumar
Date: Thu Jul 14 2016 - 18:13:08 EST

On 14-07-16, 16:12, Jan Kara wrote:
> Exactly. Calling printk() from certain parts of the kernel (like scheduler
> code or timer code) has been always unsafe because printk itself uses these
> parts and so it can lead to deadlocks. That's why printk_deffered() has
> been introduced as you mention below.
> And with sync printk the above deadlock doesn't trigger only by chance - if
> there happened to be a waiter on console_sem while we suspend, the same
> deadlock would trigger because up(&console_sem) will try to wake him up and
> the warning in timekeeping code will cause recursive printk.
> So I think your patch doesn't really address the real issue - it only
> works around the particular WARN_ON(timekeeping_enabled) warning but if
> there was a different warning in timekeeping code which would trigger, it
> has a potential for causing recursive printk deadlock (and indeed we had
> such issues previously - see e.g. 504d58745c9c "timer: Fix lock inversion
> between hrtimer_bases.lock and scheduler locks").
> So there are IMHO two issues here worth looking at:
> 1) I didn't find how a wakeup would would lead to calling to ktime_get() in
> the current upstream kernel or even current RT kernel. Maybe this is a
> problem specific to the 3.10 kernel you are using? If yes, we don't have to
> do anything for current upstream AFAIU.

I haven't checked that earlier, but I see the path in both 3.10 and mainline.

-> wake_up_process
-> try_to_wake_up
-> ttwu_queue
-> ttwu_do_activate
-> ttwu_activate
-> activate_task
-> enqueue_task (sched/core.c)
-> enqueue_task_rt (rt.c)
-> enqueue_rt_entity
-> __enqueue_rt_entity
-> inc_rt_tasks
-> inc_rt_group
-> start_rt_bandwidth
-> start_bandwidth_timer
-> __hrtimer_start_range_ns
-> ktime_get()

> If I just missed how wakeup can call into ktime_get() in current upstream,
> there is another question:
> 2) Is it OK that printk calls wakeup so late during suspend?

To clarify again to everybody, we are talking about the place where all non-boot
CPUs are already hot-unplugged and the last running one has disabled interrupts.

I believe that we can't do migration at all now, right? What will we get by
calling wake_up_process() now anyway ?

> I believe it
> is but I'm neither scheduler nor suspend expert. If it is OK, and wakeup
> can lead to ktime_get() in current upstream, then this contradicts the
> check WARN_ON(timekeeping_suspended) in ktime_get() and something is wrong.
> Adding Thomas to CC as timer / RT expert...