Re: [RFC][PATCHv3 2/5] printk: introduce printing kernel thread
From: Sergey Senozhatsky
Date: Thu Jun 29 2017 - 03:33:24 EST
On (06/28/17 14:19), Petr Mladek wrote:
[..]
> > so I try to minimize the negative impact of RT prio here. printk_kthread
> > is not special any more. it's an auxiliary kthread that we sometimes
> > wake_up. the thing is that printk_kthread also must offload at some
> > point, basically the same `atomic_print_limit' limit applies to it as
> > well.
>
> You might call cond_resched() outside console_unlock(). But you have
> to keep printk_kthread in runnable state as long as there are pending
> messages. Then scheduler will always prefer this RT task over non-RT
> tasks. Or am I wrong?
if we try to offload from IRQ->console_unlock() (or with preemption
disabled, etc. etc.) and scheduler decides to enqueue printk_kthread
on the same CPU, then no offloading will take place. I can reproduce
it on my system. we need to play some affinity games, I think. but
there are corner cases, once again.
> The more I think about it, the more I am persuaded that RT priority
> is a no-no for printk_kthread.
yeah, in fact, it didn't work as expected. so I dropped that idea
some time ago.
[..]
> > at the same time we have better guarantees.
> > we don't just wakeup(printk_kthread) and leave. we wait for any other
> > process to re-take the console_sem. until this happens we can't leave
> > console_unlock().
>
> And this is my problem. I am scared of the waiting. It is very hard
> to predict, especially without RT priority. But it is tricky anyway,
> see above.
but.....
the opposite possibility is that messages either won't be printed
soon (until next printk or console_unlock()) or won't be printed
ever at all (in case of sudden system death). I don't think it's
a good alternative.
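
to make the trade-off concrete, here is a toy user-space model of the
"don't leave console_unlock() until someone else takes over" rule. it is
only a sketch, not code from the patch: struct console_model,
model_console_unlock() and the new_owner_waiting flag are made-up names
for illustration, and the real console_sem handoff is reduced to a single
boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model for illustration only; none of this is kernel code. */
struct console_model {
	size_t pending;            /* messages still waiting in the log buffer */
	size_t atomic_print_limit; /* lines a context prints before trying to offload */
	bool   new_owner_waiting;  /* another task is ready to take over printing */
};

enum unlock_result { FLUSHED_ALL, HANDED_OFF };

/*
 * Print pending messages. Past the limit the caller would like to offload,
 * but it only stops once another owner is actually there to take over.
 * If nobody shows up, it keeps printing (the old 'direct' behaviour).
 */
static enum unlock_result model_console_unlock(struct console_model *c,
					       size_t *printed)
{
	size_t n = 0;

	assert(c != NULL && printed != NULL);

	while (c->pending > 0) {
		if (n >= c->atomic_print_limit && c->new_owner_waiting)
			break;		/* someone else continues from here */
		c->pending--;		/* "print" one message */
		n++;
	}

	*printed = n;
	return c->pending ? HANDED_OFF : FLUSHED_ALL;
}
```

with no new owner the caller flushes everything itself, possibly for a
long time (that's where the lockup reports would come from). with a new
owner it stops at the limit and the rest is left to that owner.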
[..]
> > hm. I don't want printk_kthread to be special. just because there are cases
> > when printk_kthread won't be there. we had too many problems with relying on
> > printk_kthread in all the corner cases. I want printk_kthread to be just one
> > extra process that can do the printing for us.
> > if we have X tasks sleeping in UNINTERRUPTIBLE on console_sem then
> > we better use them; keeping them in UNINTERRUPTIBLE as long as
> > printk_kthread has pending messages does no good.
>
> I am a bit confused by this paragraph. What exactly makes the
> printk_kthread special in my proposal?
the fact that we rely on it and rely on the scheduler. maybe I'm
simply misunderstanding you, sorry if so, but it seems that you both want
and don't want to depend on the scheduler at the same time. I choose
not to depend on it. and for this choice to become reasonable we need
to preserve the existing 'direct' print out guarantees. and yes, I
understand that sometimes it may cause good old lockups.
> In addition, it adds another dependency on the scheduler behavior.
> It is a can of worms as explained by Jan Kara in another mail.
and that's exactly why "wake_up() and leave console_unlock()" is not
going to fly, is it? what am I missing?
[..]
> Our two proposals are very close after all. I suggest to make
> the following changes in your patch:
>
> + Remove the waiting for another console_lock owner. It is
> too tricky.
we lose the printing guarantees this way. what if printk_kthread
doesn't wake up after all? the whole point of this design twist
(and previous discussions) was that people spoke up and said that
they want printk to do the thing it was doing for decades. even if
it would cause lockup reports sometimes (but it doesn't seem to be
such a common problem after all. how many people see printk lockup
reports more or less regularly?).
> + Instead try to reduce sleeping with console_lock taken.
> Remove that cond_resched() from console_unlock(). Maybe
> even call console_trylock()/console_unlock() with
> disabled preemption again. It will increase the chance
> that anyone else will continue handling the console.
console_unlock() should run with preemption disabled, yes.
> + keep the code to force sync mode in known emergency
> situations (halt, suspend, ...).
>
>
> This way we should be good in all situations:
>
> + sudden death because we are in sync mode until atomic_limit
> is reached
what if sudden death happens right after wake_up(printk_kthread)?
we can't just leave console_unlock().
> + flood of messages because printk() does not sleep with
> console_lock taken. Either someone is flushing console
> or any printk() call could continue flushing the console.
>
> + critical situations because we force the sync mode
> explicitly
-ss