Re: guarantee forward progress: was: Re: [PATCH printk v2 11/12] printk: extend console_lock for proper kthread support
From: John Ogness
Date: Fri Apr 08 2022 - 16:17:25 EST
On 2022-04-08, Petr Mladek <pmladek@xxxxxxxx> wrote:
> I played a lot with it and it is really hard because:
>
> + new messages can appear anytime
> + direct mode might get requested anytime
> + only the direct mode knows whether all messages were flushed
>   on all consoles
Yes, and this is why v1 dramatically simplified the picture by making
kthreads not care about direct mode. In v1 the kthread logic is very
simple: If there are messages to print, try to print them no matter
what. We didn't need to worry if someone was printing, because we knew
that at least the kthread was always printing.
This meant that there would be times when direct mode is active but the
kthreads are doing the printing. But in my experimenting, that tends to
be the case anyway, even with this more complex v2 approach. The reason
is that if some code does:
    printk_prefer_direct_enter();
    (100 lines of printk calls)
    printk_prefer_direct_exit();
And directly before that printk_prefer_direct_enter() _any_ kthread was
already inside call_console_driver(), then _all_ the console_trylock()
calls of the above 100 printk's will fail. Inserting messages into the
ringbuffer is fast and any active printer will not have finished
printing its message before the above code snippet is done.
In fact, the above snippet will only do direct printing if there were
previously no unflushed messages. That is true for v1 (by design) and v2
(by misfortune, because ringbuffer insertion is much faster than a
single call_console_driver() call).
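The timing argument above can be sketched with a toy userspace model. This
is not kernel code: sim_direct_prints() and its tick accounting are
hypothetical. It assumes that storing one record costs one tick, that a
printer already inside the driver needs driver_ticks ticks to finish, and
that it steps aside as soon as it is done (the v2 behaviour).

```c
#include <stdbool.h>

/*
 * Toy model, NOT kernel code. All names and tick costs are hypothetical.
 *
 * kthread_in_driver: a printer already holds the console when the
 *                    direct section starts
 * driver_ticks:      ticks that printer still needs for its one record
 * nr_printk:         printk calls made inside the direct section
 *
 * Returns how many of those calls would win the trylock and print
 * directly.
 */
unsigned sim_direct_prints(bool kthread_in_driver, unsigned driver_ticks,
                           unsigned nr_printk)
{
    bool locked = kthread_in_driver;
    unsigned remaining = kthread_in_driver ? driver_ticks : 0;
    unsigned direct = 0;
    unsigned i;

    for (i = 0; i < nr_printk; i++) {
        /* Storing the record costs one tick; the printer advances too. */
        if (locked) {
            if (remaining > 0)
                remaining--;
            if (remaining == 0)
                locked = false; /* printer finished and steps aside */
        }

        /* Stand-in for the caller's trylock attempt. */
        if (!locked)
            direct++;
    }

    return direct;
}
```

With these assumed costs, sim_direct_prints(true, 1000, 100) yields 0 and
sim_direct_prints(false, 1000, 100) yields 100: direct printing only
happens if nobody was inside the driver when the section began.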
This new idea (v2) of trying to stop kthreads in order to "step aside"
for direct printing is really just adding a lot of complexity, a lot of
irqwork calls, and a lot of races. And with my experimenting I am not
seeing any gain, except for new risks of nobody printing.
I understand that when we say printk_prefer_direct_enter(), we
_really_ want to do direct printing. But we cannot force it if any
printer is already inside call_console_driver(). In that case, direct
printing simply will not and cannot happen.
For v3 I recommend going back to the v1 model, where kthreads do not
care if direct mode is preferred. I claim that v2 does not yield any more
actual direct printing than v1 did.
However, I would keep the v2 change that kthreads go into their
wait_event check after every message. That at least provides earlier
responses for kthreads to stop themselves if they are disabled.
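As a rough illustration of why the per-message check gives earlier
responses, here is a toy userspace model. It is not kernel code: struct
sim_printer, sim_may_print() and sim_run() are hypothetical names, and the
"drain everything before re-checking" variant is only an assumed contrast
case.

```c
#include <stdbool.h>

/* Toy model of one console printer thread; NOT kernel code. */
struct sim_printer {
    unsigned pending;    /* records waiting to be printed */
    unsigned disable_at; /* console counts as disabled once this many are out */
    unsigned printed;    /* records emitted so far */
};

/* Stand-in for the condition a kthread would test in its wait check. */
static bool sim_may_print(const struct sim_printer *p)
{
    return p->pending > 0 && p->printed < p->disable_at;
}

/* Emit exactly one record. */
static void sim_emit_one(struct sim_printer *p)
{
    p->pending--;
    p->printed++;
}

/*
 * Run the printer until its condition fails and return the number of
 * records emitted. With check_every_record the condition is re-tested
 * after each record; otherwise the whole backlog is drained first.
 */
unsigned sim_run(struct sim_printer *p, bool check_every_record)
{
    while (sim_may_print(p)) {
        if (check_every_record) {
            sim_emit_one(p);
        } else {
            while (p->pending > 0)
                sim_emit_one(p);
        }
    }
    return p->printed;
}
```

Starting from { .pending = 10, .disable_at = 3 }, the per-record variant
stops after 3 records, while the draining variant emits all 10 before it
notices that it should have stopped.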
Once we have atomic consoles, things will look different. Then we can
perform true synchronous direct printing. But without them, the "prefer"
in printk_prefer_direct_enter() is only a preference that can only be
satisfied under ideal situations (i.e. no kthread is inside
call_console_driver()).
John