Re: [RFC][PATCH 2/4] printk: offload printing from wake_up_klogd_work_func()

From: Petr Mladek
Date: Fri Mar 17 2017 - 08:21:17 EST


On Mon 2017-03-06 21:45:52, Sergey Senozhatsky wrote:
> Offload printing of printk_deferred() messages from IRQ context
> to a schedulable printing kthread, when possible (the same way
> we do it in vprintk_emit()). Otherwise, console_unlock() can
> force the printing CPU to spend unbound amount of time flushing
> kernel messages from IRQ context.
>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>
> ---
> kernel/printk/printk.c | 13 ++++++++++---
> 1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 1c4232ca2e6a..6e00073a7331 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2735,9 +2735,16 @@ static void wake_up_klogd_work_func(struct irq_work *irq_work)
>  	int pending = __this_cpu_xchg(printk_pending, 0);
>  
>  	if (pending & PRINTK_PENDING_OUTPUT) {
> -		/* If trylock fails, someone else is doing the printing */
> -		if (console_trylock())
> -			console_unlock();
> +		if (printk_kthread_enabled()) {
> +			wake_up_process(printk_kthread);

I have just noticed a possible race. printk_deferred() does not set
printk_kthread_need_flush_console, so a pending job might be left
behind:

CPU0                                    CPU1

printk_kthread_func()

  printk_kthread_need_flush_console = false;

  console_lock()
  console_unlock()

                                        printk_deferred()
                                          vprintk_emit()
                                            irq_work_queue()


                                        <IRQ>
                                        wake_up_klogd_work_func()
                                          if (printk_kthread_enabled())
                                            wake_up_process(printk_kthread);

  set_current_state(TASK_INTERRUPTIBLE);
  if (!printk_kthread_need_flush_console)
    schedule();

Result: printk_kthread goes to sleep even though there is
a pending job.
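
To make the interleaving concrete, here is a minimal userspace model
in C. It is only a sketch and not kernel code: the struct and all
model_*() helpers are hypothetical names, the log buffer is reduced
to a counter, and the two CPUs are replayed sequentially in the
order shown in the diagram. The one kernel behavior it relies on is
that wake_up_process() does nothing to a task that is already
running.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; names mirror the thread, not real kernel symbols. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

struct printk_model {
	bool need_flush_console;	/* models printk_kthread_need_flush_console */
	int pending_msgs;		/* messages not yet printed on the console */
	enum task_state kthread_state;
	bool kthread_asleep;
};

/* Models console_lock() + console_unlock(): print everything pending. */
static void model_console_flush(struct printk_model *m)
{
	m->pending_msgs = 0;
}

/* Models printk_deferred(): stores a message but does NOT set the flag. */
static void model_printk_deferred(struct printk_model *m)
{
	m->pending_msgs++;
}

/* Models wake_up_process(): no effect on a task that is already running. */
static void model_wake_up_kthread(struct printk_model *m)
{
	if (m->kthread_state != TASK_RUNNING) {
		m->kthread_state = TASK_RUNNING;
		m->kthread_asleep = false;
	}
}

/* Models the sleep check at the end of the printk_kthread_func() loop. */
static void model_kthread_maybe_sleep(struct printk_model *m)
{
	m->kthread_state = TASK_INTERRUPTIBLE;
	if (!m->need_flush_console)
		m->kthread_asleep = true;	/* schedule() */
	else
		m->kthread_state = TASK_RUNNING;
}

/* Replays the diagram; returns the messages stranded while the kthread sleeps. */
int model_deferred_race(void)
{
	struct printk_model m = {
		.need_flush_console = false,
		.pending_msgs = 1,
		.kthread_state = TASK_RUNNING,
		.kthread_asleep = false,
	};

	/* CPU0: printk_kthread_func() */
	m.need_flush_console = false;
	model_console_flush(&m);
	assert(m.pending_msgs == 0);	/* the first flush drained everything */

	/* CPU1: printk_deferred(), then the irq_work handler */
	model_printk_deferred(&m);
	model_wake_up_kthread(&m);	/* no-op: the kthread is still running */

	/* CPU0: back in the kthread loop */
	model_kthread_maybe_sleep(&m);

	return m.kthread_asleep ? m.pending_msgs : 0;
}
```

In this model, model_deferred_race() returns 1: the kthread is
asleep while one message is still waiting to be printed.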


A solution might be to rename the variable to something like
printk_pending_output, always set it in vprintk_emit() and
clear it in console_unlock() when there are no pending messages.
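
A rough userspace sketch of that idea, in C, might look as follows.
It is not a patch: the struct and the model_*() helpers are
hypothetical, the log buffer is reduced to a counter, and the
locking that the real console_unlock() would need around the
"no pending messages" check is left out. It replays the racy
interleaving with the flag owned by the producer and the consumer.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; printk_pending_output is the proposed name. */
struct pending_model {
	bool pending_output;	/* set by the producer, cleared by the consumer */
	int pending_msgs;	/* messages not yet printed on the console */
	bool kthread_asleep;
};

/* Models vprintk_emit(): every producer path, deferred or not, sets the flag. */
static void model_vprintk_emit(struct pending_model *m)
{
	m->pending_msgs++;
	m->pending_output = true;
}

/* Models console_unlock(): clear the flag only once nothing is left. */
static void model_console_unlock(struct pending_model *m)
{
	while (m->pending_msgs > 0)
		m->pending_msgs--;	/* "print" one message */
	m->pending_output = false;
}

/* One pass of the kthread loop: sleep if idle, otherwise flush. */
static void model_kthread_pass(struct pending_model *m)
{
	if (!m->pending_output) {
		m->kthread_asleep = true;	/* schedule() */
		return;
	}
	model_console_unlock(m);
}

/* Replays the racy interleaving; returns the messages stranded during sleep. */
int model_fixed_interleaving(void)
{
	struct pending_model m = {
		.pending_output = false,
		.pending_msgs = 0,
		.kthread_asleep = false,
	};

	model_vprintk_emit(&m);		/* earlier message that woke the kthread */
	model_console_unlock(&m);	/* CPU0: kthread flushes, flag is cleared */
	assert(m.pending_msgs == 0 && !m.pending_output);

	model_vprintk_emit(&m);		/* CPU1: printk_deferred() -> vprintk_emit() */
	/* CPU1 <IRQ>: wake_up_process() is a no-op, the kthread is running */

	model_kthread_pass(&m);		/* CPU0: the sleep check now sees the flag */

	return m.kthread_asleep ? m.pending_msgs : 0;
}
```

Here model_fixed_interleaving() returns 0: the sleep check finds the
flag set, so the kthread flushes the deferred message instead of
going to sleep.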

I think that we have already discussed this in the past.
This solution would also remove one extra cycle when several
messages are handled by one console_unlock() call:

CPU0                                    CPU1

printk()
  vprintk_emit()
    printk_kthread_need_flush_console = true;
    wake_up_process(printk_kthread)


                                        <printk_kthread>

                                        printk_kthread_need_flush_console
                                                = false;

                                        console_lock()

printk()
  vprintk_emit()
    printk_kthread_need_flush_console = true;
    wake_up_process(printk_kthread)

                                        console_unlock()

                                        set_current_state(TASK_INTERRUPTIBLE);
                                        if (!printk_kthread_need_flush_console)
                                           <fail>

                                        __set_current_state(TASK_RUNNING);

                                        console_lock()
                                        console_unlock()

Result: The second console_unlock() has nothing to do.
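
The extra cycle can be counted with a small userspace model in C.
This is a sketch under stated assumptions, not kernel code: all
names are hypothetical, the log buffer is a counter, and the
interleaving from the diagram is replayed sequentially. The boolean
parameter selects who clears the flag: the kthread before
console_lock() (current scheme) or console_unlock() once the buffer
is drained (proposed scheme).

```c
#include <stdbool.h>

/* Userspace model only; none of these names exist in the kernel. */
struct cycle_model {
	bool flag;		/* "there is output to flush" */
	int pending_msgs;	/* messages not yet printed on the console */
	int empty_passes;	/* console_unlock() calls with nothing to print */
};

/* Models printk() -> vprintk_emit(): store a message and set the flag. */
static void model_emit(struct cycle_model *m)
{
	m->pending_msgs++;
	m->flag = true;
}

/* Models console_unlock(); optionally clears the flag once drained. */
static void model_unlock(struct cycle_model *m, bool clear_when_drained)
{
	if (m->pending_msgs == 0)
		m->empty_passes++;
	m->pending_msgs = 0;
	if (clear_when_drained)
		m->flag = false;
}

/* Replays the diagram; returns how many flush passes had nothing to do. */
int count_empty_passes(bool clear_when_drained)
{
	struct cycle_model m = { .flag = false, .pending_msgs = 0, .empty_passes = 0 };

	model_emit(&m);			/* CPU0: first printk() wakes the kthread */

	if (!clear_when_drained)
		m.flag = false;		/* CPU1: kthread clears the flag up front */
	model_emit(&m);			/* CPU0: second printk() during the flush */
	model_unlock(&m, clear_when_drained);	/* CPU1: prints both messages */

	/* CPU1: the sleep check fails for as long as the flag is still set */
	while (m.flag) {
		if (!clear_when_drained)
			m.flag = false;
		model_unlock(&m, clear_when_drained);
	}

	return m.empty_passes;
}
```

In this model, count_empty_passes(false) returns 1 and
count_empty_passes(true) returns 0, which matches the single wasted
console_lock()/console_unlock() pair in the diagram.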


If I remember correctly, you were not very happy with this
solution because it spreads the logic. I think that you did not
believe that the second problem was worth fixing. But fixing
the race might require spreading the logic as well.

I see it the following way: vprintk_emit() is a producer,
console_unlock() is a consumer, and printk_kthread is one room
in which the consumer can do its job. The consumer has other
rooms available as well. The state variable is a flag saying that
there is a pending job, that the consumer is looking for a room,
and that printk_kthread should offer one.

Of course, it is possible that you will find a better
solution.

Best Regards,
Petr