Re: [RFC][PATCHv3 2/5] printk: introduce printing kernel thread

From: Sergey Senozhatsky
Date: Fri Jun 30 2017 - 03:01:35 EST


Hello,

On (06/29/17 16:33), Sergey Senozhatsky wrote:
[..]
> On (06/28/17 14:19), Petr Mladek wrote:
> [..]
> > > so I try to minimize the negative impact of RT prio here. printk_kthread
> > > is not special any more. it's an auxiliary kthread that we sometimes
> > > wake_up. the thing is that printk_kthread also must offload at some
> > > point, basically the same `atomic_print_limit' limit applies to it as
> > > well.
> >
> > You might call cond_resched() outside console_unlock(). But you have
> > to keep printk_kthread in runnable state as long as there are pending
> > messages. Then scheduler will always prefer this RT task over non-RT
> > tasks. Or am I wrong?
>
> if we try to offload from IRQ->console_unlock() (or with preemption
> disabled, etc. etc.) and scheduler decides to enqueue printk_kthread
> on the same CPU, then no offloading will take place. I can reproduce
> it on my system. we need to play some affinity games, I think. but
> there are corner cases, once again.

something like this (in console_offload_printing()). try to keep
printk_kthread out of this_cpu->rq, so we (hopefully) wake it up
on some other CPU (if there are other CPUs online), that is not
in console_unlock() now:

---
if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL)) {
wake_up_process(printk_kthread);
return true;
}

cpumask_copy(cpus_allowed, cpu_online_mask);
cpumask_clear_cpu(smp_processor_id(), cpus_allowed);

/*
* If this_cpu is the only one online, then try to wake up
* `printk_kthread' on it. what else we can do...
*/
if (cpumask_empty(cpus_allowed))
cpumask_set_cpu(smp_processor_id(), cpus_allowed);

set_cpus_allowed_ptr(printk_kthread, cpus_allowed);
wake_up_process(printk_kthread);

free_cpumask_var(cpus_allowed);
---

once again, this is merely for testing purposes. I haven't done
many tests yet, will continue testing; but looks promising. there
are some corner cases here.

JFYI, pushed updated patch set to my tree
https://github.com/sergey-senozhatsky/linux-next-ss/commits/printk-kthread-affine

it also contains a debugging patch that I'm using to track
printk_kthread behaviour.


I'm still thinking about Steven's proposals; but we will need offloading
anyways, so the bits we are talking about here are important regardless
the direction printk design will take, I think.


> [..]
> > Our two proposals are very close after all. I suggest to make
> > the following changes in your patch:
> >
> > + Remove the waiting for another console_lock owner. It is
> > too tricky.
>
> we lose the printing guarantees this way. what if printk_kthread
> doesn't wake up after all? the whole point of this design twist
> (and previous discussions) was that people spoke up and said that
> they want printk to do the thing it was doing for decades. even if
> it would cause lockup reports sometimes

to be clear on this.


we are doing our best in order to avoid lockups caused by console_unlock(),
but the top priority remains messages print out. If we can't guarantee that
anything will take over and print the messages, we continue printing from
the current process, even though it may result in lockups.


this is based on my own experience with the previous "wake_up and forget
about it" async printk patch set (v12) which we rolled out to our fleet
of developers' boards; responses we received from the community; and
somehow it also aligned with the recent Linus' reply

: If those two things aren't the absolutely primary goals, the whole
: thing is pointless to even discuss. No amount of cool features,
: performance, or theoretical deadlock avoidance matters ONE WHIT
: compared to the two things above.

// the two things above were -- messages on the screen and dmesg.


so, offloading is cool and avoiding lockups is also very much desired,
but we need to print out the messages in the first place.

-ss