Re: [RFC][PATCH 0/4] printk: introduce printing kernel thread

From: Petr Mladek
Date: Tue Apr 04 2017 - 09:00:06 EST


On Mon 2017-04-03 20:53:11, Sergey Senozhatsky wrote:
> On (03/06/17 21:45), Sergey Senozhatsky wrote:
> [..]
> > printk kthread changes the behavior of printk in one _corner case_.
> > The corner case is quite interesting and actually consists of two corner
> > cases. Suppose that on an SMP system only one CPU printk()-s a lot, while
> > the rest of the CPUs neither lock console_sem nor printk(). Previously
> > that printing CPU throttled itself (*) because every printk() also
> > called the console drivers:
> >
> > CPU0
> >
> > printk("a")
> > console_unlock()
> > call_console_drivers("a")
> >
> > ...
> >
> > printk("z")
> > console_unlock()
> > call_console_drivers("z")
> >
> > (*) Given that no other CPU locks console_sem.
> >
> > With printk kthread the case turns into this one:
> >
> > CPU0                          CPU1
> >
> > printk("a")
> > wake_up printk_kthread
> > ...                           printk_kthread
> > printk("k")                     console_unlock()
> > ...                               call_console_drivers("a")
> > printk("z")                       call_console_drivers("b")
> >                                   call_console_drivers("c")
> >                                   ...
> >
> >
> > The second 'corner case' part here is that CPU0 may be much faster
> > than the printing CPU, which may result in dropped printk messages.
> >
> > All of this is entirely possible even without the printk kthread.
> > A single console_lock() call from CPUx results in exactly the
> > same condition. So it's not necessarily a regression. But there may
> > be some scenarios in the kernel that suffer from this change.
> > Off the top of my head: the sysrq backtrace dump and, probably, the OOM
> > printout and backtrace dump.
>
> there is another possibility here.
>
> being always reschedulable potentially puts us at risk of
> unpleasant situations where printk_kthread gets preempted too
> often (well, who knows what can happen on the system), which can slow
> down the logbuf emit process (printk_kthread) to the point where the
> printk() CPUs force log_store() to begin dropping messages. this can
> happen.

I believe that this will be rather a corner case. If it happens, we
could adjust the scheduling priority and policy of the printing thread.
There is also the possibility to fall back to the old mode, where
printk() calls the console drivers directly.
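
To make the trade-off concrete, here is a toy userspace model of the
situation. It is only a sketch, not kernel code: the ring, the per-tick
rates, and every name in it (sim_result, simulate_log_ring, sync_fallback)
are hypothetical, and the "fallback" simply assumes that the old mode means
the producer prints the pending messages itself when the buffer is full.

```c
#include <assert.h>

/* Toy userspace model, not kernel code. All names are hypothetical. */
struct sim_result {
	unsigned long emitted;	/* messages that reached the "console" */
	unsigned long dropped;	/* messages overwritten before printing */
};

/*
 * Each tick the producer stores `produce` messages in a ring with
 * `capacity` slots, then the printing thread emits up to `consume`.
 * When the ring is full, either the oldest pending message is dropped
 * (sync_fallback == 0) or the producer flushes the ring itself, which
 * stands in for the old synchronous mode (sync_fallback != 0).
 */
struct sim_result simulate_log_ring(unsigned int capacity,
				    unsigned int produce,
				    unsigned int consume,
				    unsigned int ticks,
				    int sync_fallback)
{
	struct sim_result r = { 0, 0 };
	unsigned int pending = 0;	/* stored but not yet printed */
	unsigned int t, i, n;

	assert(capacity > 0);

	for (t = 0; t < ticks; t++) {
		/* Producer side: the CPU that printk()-s a lot. */
		for (i = 0; i < produce; i++) {
			if (pending == capacity) {
				if (sync_fallback) {
					/* Producer throttles itself and prints. */
					r.emitted += pending;
					pending = 0;
				} else {
					/* Oldest unprinted message is lost. */
					r.dropped++;
					pending--;
				}
			}
			pending++;
		}

		/* Consumer side: the (possibly slow) printing thread. */
		n = consume < pending ? consume : pending;
		r.emitted += n;
		pending -= n;
	}

	/* Whatever is still pending at the end gets printed eventually. */
	r.emitted += pending;
	return r;
}
```

In this model, simulate_log_ring(8, 4, 1, 10, 0) loses 23 of the 40
messages, while the same call with sync_fallback set to 1 loses none, at
the cost of the producer doing the printing.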

We have used some variants of the printk offload on SLE for years and
I am not aware of any complaints of this sort.

Best Regards,
Petr