Re: [RFC][PATCH v2 1/2] printk: Make printk() completely async

From: Sergey Senozhatsky
Date: Sun Mar 06 2016 - 08:25:55 EST


On (03/06/16 20:06), Tetsuo Handa wrote:
[..]
> > do you mean a new worker allocation delay and a MAYDAY timer delay?
> >
>
> I don't know what MAYDAY is. I'm talking about a situation where printing_work
> work item is not processed (i.e. printing_work_func() is not called) until
> current work item calls schedule_timeout_*().
>
> We had a problem that since vmstat_work work item was using system_wq,
> vmstat_work work item was not processed (i.e. vmstat_update() was not called) if
> kworker was looping inside memory allocator without calling schedule_timeout_*()
> due to disk_events_workfn() doing GFP_NOIO allocation).

hm, just for note, none of system-wide wqs seem to have a ->rescuer thread
(WQ_MEM_RECLAIM).

[..]
> Even if you use printk_wq with WQ_MEM_RECLAIM for printing_work work item,
> printing_work_func() will not be called until current work item calls
> schedule_timeout_*(). That will be an undesirable random delay. If you use
> a dedicated kernel thread rather than a dedicated workqueue with WQ_MEM_RECLAIM,
> we can avoid this random delay.

hm. yes, seems that it may take some time until workqueue wakeup() a ->rescuer thread.
need to look more.

[..]
> > console_lock();
> > for (...) {
> > do_foo() {
> > ...
> > pr_err(" ... foo message ...\n");
> > ...
> > }
> > }
> > console_unlock();
> >
> > then yes, nothing will be printed until that process executes console_unlock(),
> > because it's console_unlock() that pushes the messages to console drivers.
>
> Yes, I meant a sequence like
>
> console_lock();
> ptr = kmalloc(GFP_KERNEL);
> kfree(ptr);
> console_unlock();
>
> and kmalloc() prints OOM killer messages rather than failing that allocation.
> Are we sure that there is no such usage?

such usage is quite possible.

problems that I have with console_lock()/console_unlock() is that
these functions serve a double purpose: exclusive printk() lock and a
console_drivers list lock.

**** I haven't really thought about it yet, but I want to split it. ****

console_lock()/console_unlock() can be executed by user space processes
(inside system calls). for example:

SyS_open()
...
chrdev_open()
tty_open()
console_device()
console_lock()
console_unlock()
for (;;) {
call_console_drivers()
}

or doing `cat /proc/consoles`

SyS_read()
vfs_read()
proc_reg_read()
seq_read()
c_stop()
console_unlock()
for (;;) {
call_console_drivers()
}

which can introduce two nasty problems:

1) console_lock() may put user space process in TASK_UNINTERRUPTIBLE for
unknown period of time -- until current console_sem owner will not
finish print loop in console_unlock(). no signals, heartbeats, etc.
will be processed by this user space process.

2) user space process may have to spend an unknown period of time in
console_unlock() later, pushing "3rd party" kernel messages to
console drivers. again, not really good.

(kthreads can suffer here too, sure).


in the examples above, a process just wanted to iterate the console_drivers
list in read access mode. so, for instance, in

struct tty_driver *console_device(int *index)
{
struct console *c;
struct tty_driver *driver = NULL;

console_lock();
for_each_console(c) {
if (!c->device)
continue;
driver = c->device(c, index);
if (driver)
break;
}
console_unlock();
return driver;
}

instead of console_lock()/console_unlock()->call_console_drivers()
it could do (very schematically)

read_lock_console();
for_each_console(c) {
if (!c->device)
continue;
driver = c->device(c, index);
if (driver)
break;
}
read_unlock_console();

and in functions that modify the list, the lock can be acquired in
write mode. example,

int unregister_console(struct console *console)
{
write_lock_console();
...
for (a=console_drivers->next, b=console_drivers ;
a; b=a, a=b->next) {
if (a == console) {
b->next = a->next;
res = 0;
break;
}
}
}
...
if (console_drivers != NULL && console->flags & CON_CONSDEV)
console_drivers->flags |= CON_CONSDEV;

console->flags &= ~CON_ENABLED;
write_unlock_unlock();



printk(), thus, will take its own "another" exclusive lock, to guarantee
that only one process can call_console_drivers(), and it will take the
console_drivers list lock in read mode. so other process(-es) that also
want to access console_drivers list in read mode will not wait in
TASK_UNINTERRUPTIBLE.

of course, this means that console_unlock() now does not print
anything to console drivers. it's printk() duty to do this.

and the last statement can be very hard to sell; because I don't know
for sure if there is a code in the kernel that depends on the fact that
console_lock() stops printk() and that console_unlock() prints all of
the printk messages.


that "lock split" also can fix another *theoretical* scenario: suppose,
that for some reason someone's setup has both a) huge number of printk()
calls and b) a relatively huge number of console_lock()/console_unlock()
calls, all happening simultaneously. while async printk helps in (a),
case (b) "detours" the printk async code; and if cpus at lest some of the
cpus that do console_lock()/console_unlock() also disable IRQs or
preemption, then lockups and all the bad things are very much likely.


just an idea.

-ss