Re: [Query] Preemption (hogging) of the work handler
From: Rafael J. Wysocki
Date: Wed Jul 13 2016 - 19:09:05 EST
On Wed, Jul 13, 2016 at 5:39 PM, Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
> On 13-07-16, 14:45, Sergey Senozhatsky wrote:
>> On (07/12/16 16:19), Viresh Kumar wrote:
[cut]
>>
>> something like below, perhaps. will this work for you?
>
> Maybe not, as this can still lead to the original bug we were all
> chasing. This may hog some other CPU if we are doing excessive
> printing in suspend :(

How can it hog that CPU, exactly?

> suspend_console() is called quite early, so for example in my case we
> do lots of printing during suspend (not from the suspend thread, but
> an IRQ handled by the USB subsystem, which removes a bus with help of
> some other thread probably).

Why is doing a lot of printing from an IRQ not regarded as a bug?
Are all of the printed messages actually useful?

> That is why my Hacky patch tried to do it after devices are removed
> and irqs are disabled, but before syscore users are suspended (and
> timekeeping is one of them). And so it fixes it for me completely.
>
> IOW, we should switch back to synchronous printing after disabling
> interrupts on the last running CPU.
>
> And I of course agree with Rafael that we would need something similar
> in Hibernation code path as well, if we choose to fix it my way.

Well, the patch proposed by Sergey is sufficient to fix the deadlock
issue and it is not clear that anything more needs to be done.

My suggestion, then, would be to start with this patch and see
if things really get worse.

Thanks,
Rafael