Re: [patch RFC 00/29] printk: A new approach - WIP

From: Paul E. McKenney
Date: Sun Sep 11 2022 - 05:02:37 EST

Next message: Nathan Chancellor: "[PATCH net-next v2] net/mlx5e: Ensure macsec_rule is always initiailized in macsec_fs_{r,t}x_add_rule()"
Previous message: cgel . zte: "[PATCH] ipv4: add documentation of sysctls of error_cost and error_burst"
In reply to: Thomas Gleixner: "Re: [patch RFC 00/29] printk: A new approach - WIP"
Next in thread: Linus Torvalds: "Re: [patch RFC 00/29] printk: A new approach - WIP"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Sun, Sep 11, 2022 at 12:27:31AM +0200, Thomas Gleixner wrote:

[ . . . ]

> The infrastructure we implemented comes with the following properties:
>
> - It shares the console list, but only for registration/unregistration
> purposes. Consoles which are utilizing the new infrastructure are
> ignored by the existing mechanisms and vice versa. This allows to
> reuse as much code as possible and preserves the printk semantics
> except for the threaded printing part.
>
> - The console list walk becomes SRCU protected to avoid any restrictions
> on contexts

I am guessing that this means that you need an NMI-safe srcu_read_lock()
and srcu_read_unlock(). If my guess is correct, please let me know,
and I will create one for you. (As it stands, these are NMI-safe on x86,
but not on architectures using the asm-generic variant of this_cpu_inc().

The result would be srcu_read_lock_nmi() and srcu_read_unlock_nmi()
or similar. There would need to be something to prevent mixing of
srcu_read_lock() and srcu_read_lock_nmi().

Or are you somehow avoiding ever invoking either srcu_read_lock() or
srcu_read_unlock() from NMI context?

For example, are we simply living dangerously with NMI-based stack traces
as suggested below? ;-)

> - Consoles become stateful to handle handover and takeover in a graceful
> way.
>
> - All console state operations rely solely on atomic*_try_cmpxchg() so
> they work in any context.
>
> - Console locking is per console to allow friendly handover or "safe"
> hostile takeover in emergency/panic situations. Console lock is not
> relevant for consoles converted to the new infrastructure.
>
> - The core provides interfaces for console drivers to query whether they
> can proceed and to denote 'unsafe' sections in the console state, which
> is unavoidable for some console drivers.
>
> In fact there is not a single regular (non-early) console driver today
> which is reentrancy safe under all circumstances, which enforces that
> NMI context is excluded from printing directly. TBH, that's a sad state
> of affairs.
>
> The unsafe state bit allows to make informed decisions in the core
> code, e.g. to postpone printing if there are consoles available which
> are safe to acquire. In case of a hostile takeover the unsafe state bit
> is handed to the atomic write callback so that the console driver can
> act accordingly.
>
> - Printing is delegated to a per console printer thread except for the
> following situations:
>
> - Early boot
> - Emergency printing (WARN/OOPS)
> - Panic printing
>
> The integration is complete, but without any fancy things, like locking all
> consoles when entering a WARN, print and unlock when done. Such things only
> make sense once all drivers are converted over because that conflicts with
> the way how the existing console lock mechanics work.
>
> For testing we used the most simple driver: a hacked up version of the
> early uart8250 console as we wanted to concentrate on validating the core
> mechanisms of friendly handover and hostile takeovers instead of dealing
> with the horrors of port locks or whatever at the same time. That's the
> next challenge. Hack patch will be provided in a reply.
>
> Here is sample output where we let the atomic and thread write functions
> prepend each line with the printing context (A=atomic, T=thread):
>
> A[ 0.394066] ... fixed-purpose events: 3
> A[ 0.395130] ... event mask: 000000070000000f
>
> End of early boot, thread starts
>
> TA[ 0.396821] rcu: Hierarchical SRCU implementation.
>
> ^^ Thread starts printing and immediately raises a warning, so atomic
> context at the emergency priority takes over and continues printing.
>
> This is a forceful, but safe takeover scenario as the WARN context
> is obviously on the same CPU as the printing thread where friendly
> is not an option.
>
> A[ 0.397133] rcu: Max phase no-delay instances is 400.
> A[ 0.397640] ------------[ cut here ]------------
> A[ 0.398072] WARNING: CPU: 0 PID: 13 at drivers/tty/serial/8250/8250_early.c:123 __early_serial8250_write.isra.0+0x80/0xa0
> ....
> A[ 0.440131] ret_from_fork+0x1f/0x30
> A[ 0.441133] </TASK>
> A[ 0.441867] ---[ end trace 0000000000000000 ]---
> T[ 0.443493] smp: Bringing up secondary CPUs ...
>
> After the warning the thread continues printing.
>
> ....
>
> T[ 1.916873] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
> T[ 1.918719] md: Waiting for all devices to be available before autod
>
> A[ 1.918719] md: Waiting for all devices to be available before autodetect
>
> System panics because it can't find a root file system. Panic printing
> takes over the console from the printer thread immediately and reprints
> the interrupted line.
>
> This case is a friendly handover from the printing thread to the panic
> context because the printing thread was not running on the panic CPU, but
> handed over gracefully.
>
> A[ 1.919942] md: If you don't use raid, use raid=noautodetect
> A[ 1.921030] md: Autodetecting RAID arrays.
> A[ 1.921919] md: autorun ...
> A[ 1.922686] md: ... autorun DONE.
> A[ 1.923761] /dev/root: Can't open blockdev
>
> So far the implemented state handling machinery holds up on the various
> handover and hostile takeover situations we enforced for testing.
>
> Hostile takeover is nevertheless a special case. If the driver is in an
> unsafe region that's something which needs to be dealt with per driver.
> There is not much the core code can do here except of trying a friendly
> handover first and only enforcing it after a timeout or not trying to print
> on such consoles.
>
> This needs some thought, but we explicitely did not implement any takeover
> policy into the core state handling mechanism as this is really a decision
> which needs to be made at the call site. See patch 28.
>
> We are soliciting feedback on that approach and we hope that we can
> organize a BOF in Dublin on short notice.

Last I checked, there were still some slots open.

> The series is also available from git:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git printk
>
> The series has the following parts:
>
> Patches 1 - 5: Cleanups
>
> Patches 6 - 12: Locking and list conversion
>
> Patches 13 - 18: Improved output buffer handling to prepare for
> code sharing
>
> Patches 19 - 29: New infrastructure implementation
>
> Most of the preparatory patches 1-18 have probably a value on their own.
>
> Don't be scared about the patch stats below. We added kernel doc and
> extensive comments to the code:
>
> kernel/printk/printk_nobkl.c: Code: 668 lines, Comments: 697 lines, Ratio: 1:1.043

When I reach stable AC, I will fire off some rcutorture tests. From the
discussion above, it sounds like I should expect some boot-time warnings?
Or were those strictly printk-specific testing?

> Of course the code is trivial and straight forward as any other facility
> which has to deal with concurrency and the twist of being safe in any
> context. :)

;-) ;-) ;-)

Thanx, Paul

> Comments welcome.
>
> Thanks,
>
> tglx
> ---
> [1] https://lore.kernel.org/lkml/87r11qp63n.ffs@tglx/
> ---
> arch/parisc/include/asm/pdc.h | 2
> arch/parisc/kernel/pdc_cons.c | 53 -
> arch/parisc/kernel/traps.c | 17
> b/kernel/printk/printk_nobkl.c | 1564 +++++++++++++++++++++++++++++++++++++++++
> drivers/tty/serial/kgdboc.c | 7
> drivers/tty/tty_io.c | 6
> fs/proc/consoles.c | 12
> fs/proc/kmsg.c | 2
> include/linux/console.h | 375 +++++++++
> include/linux/printk.h | 9
> include/linux/syslog.h | 3
> kernel/debug/kdb/kdb_io.c | 7
> kernel/panic.c | 12
> kernel/printk/printk.c | 485 ++++++++----
> 14 files changed, 2304 insertions(+), 250 deletions(-)

Next message: Nathan Chancellor: "[PATCH net-next v2] net/mlx5e: Ensure macsec_rule is always initiailized in macsec_fs_{r,t}x_add_rule()"
Previous message: cgel . zte: "[PATCH] ipv4: add documentation of sysctls of error_cost and error_burst"
In reply to: Thomas Gleixner: "Re: [patch RFC 00/29] printk: A new approach - WIP"
Next in thread: Linus Torvalds: "Re: [patch RFC 00/29] printk: A new approach - WIP"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]