Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

From: Sergey Senozhatsky
Date: Mon Jan 11 2016 - 08:27:32 EST

Hello Jan,

On (01/06/16 13:25), Jan Kara wrote:
> > a quote from -mm a74b6533ead8
> > particularly this "workqueue context is not appropriate because all the workers
> > might be busy (e.g. allocating memory)" part. I think printk should switch to
> > sync mode in this case, since printk now does queue_work(system_wq, work).
> > um... console_verbose() call from oom kill? but it'll be nice to return back
> > to async mode once (if) memory pressure goes away.
> Hum, yes, some mechanism to switch to sync printing in case work cannot be
> executed for a long time is probably needed. I'll think about it.

well, technically, worker_pool keeps ->watchdog_ts updated, so, basically,
the worker pool knows when it stalls. with CONFIG_WQ_WATCHDOG enabled,
wq_watchdog_timer_fn() checks that value and calls pr_emerg(). in the worst
case, printk can depend on CONFIG_WQ_WATCHDOG (yes, this sounds a bit sad),
which implies, however, a potentially long print from timer_fn. having a
printk()-specific timer_fn that does the same is just a duplication of
functionality; and checking the value in every vprintk_emit() is not really
an option either, I'm afraid, since there may be no printk calls for some time.
just my 5 cents. probably you have better ideas.
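
to make the ->watchdog_ts idea a bit more concrete, here is a minimal
userspace sketch (not kernel code, and not a proposed patch). only the
watchdog_ts name mirrors the worker_pool field; struct pool_model,
pool_stalled(), pick_printk_mode() and the thresh parameter are made up
for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* hypothetical model of a worker pool; not the kernel's struct worker_pool */
struct pool_model {
	uint64_t watchdog_ts;	/* tick at which the pool last made progress */
};

/* hypothetical printing modes: deferred to a work item vs. done by the caller */
enum printk_mode { PRINTK_ASYNC, PRINTK_SYNC };

/*
 * the pool counts as stalled once more than `thresh' ticks have passed
 * since watchdog_ts was last updated. `now' is assumed to be monotonic,
 * i.e. never smaller than watchdog_ts.
 */
bool pool_stalled(const struct pool_model *pool, uint64_t now, uint64_t thresh)
{
	return now - pool->watchdog_ts > thresh;
}

/* print synchronously while the pool is stalled, go back to async otherwise */
enum printk_mode pick_printk_mode(const struct pool_model *pool,
				  uint64_t now, uint64_t thresh)
{
	return pool_stalled(pool, now, thresh) ? PRINTK_SYNC : PRINTK_ASYNC;
}
```

the open question from above still stands: somebody has to call the check,
and neither a timer_fn nor vprintk_emit() looks like a great place for it.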

one more thing, include/linux/workqueue.h says

: System-wide workqueues which are always present.
: system_wq is the one used by schedule[_delayed]_work[_on]().
: Multi-CPU multi-threaded. There are users which expect relatively
: short queue flush time. Don't queue works which can run for too
: long.
: system_long_wq is similar to system_wq but may host long running
: works. Queue flushing might take relatively long.
: system_unbound_wq is unbound workqueue. Workers are not bound to
: any specific CPU, not concurrency managed, and all queued works are
: executed immediately as long as max_active limit is not reached and
: resources are available.

wake_up_klogd_work_func() runs on `system_wq' and does
'console_lock()/console_unlock()', both of which can take a long time.
should it be switched to `system_long_wq' or `system_unbound_wq'?