Re: [PATCH rcu v2] 4/5] rcu-tasks: Move RCU Tasks self-tests to core_initcall()

From: Paul E. McKenney
Date: Thu Feb 06 2025 - 11:33:55 EST


On Thu, Feb 06, 2025 at 12:06:08PM +0100, Petr Mladek wrote:
> On Wed 2025-02-05 15:50:14, Paul E. McKenney wrote:
> > On Wed, Feb 05, 2025 at 11:37:25PM +0106, John Ogness wrote:
> > > On 2025-02-05, John Ogness <john.ogness@xxxxxxxxxxxxx> wrote:
> > > >> OK, so I don't need to add "if (IS_ENABLED(CONFIG_PREEMPT_RT))" to guard
> > > >> it, then?
> > > >
> > > > For !CONFIG_PREEMPT_RT, if there are legacy consoles registered,
> > > > pr_flush() will additionally perform a blocking lock on the
> > > > console_lock.
> > >
> > > Sorry, I was just thinking about the flushing component of
> > > pr_flush(). Later in the function it takes the console_lock even for
> > > CONFIG_PREEMPT_RT. So please do _not_ have a guard.

Thank you all for looking this over!

> It should be possible to avoid this console_lock() when only NBCON
> consoles are registered. We take it in pr_flush() for reading con->seq
> and potential race with register_console(). But it can be handled
> another way, for example, by restarting the cycle when non-NBCON
> console suddenly appears.
>
> But it might be done later. We do not have NBCON consoles at the
> moment so the console_lock() is needed anyway.

Fair enough.

> > > pr_flush() should never hang on the console_lock during shutdown, but if
> > > does, that is something that will need to be debugged and fixed.
> > >
> > > @pmladek: Looking forward to reading your input on this.
>
> My only concern is that kernel_power_off() is called in many
> code paths so it is not easy to review all scenarios.
>
> But I haven't found any code path where it would be an obvious problem.
> Especially, it seems that it is _not_ called in panic situations.
>
> Also kernel_power_off() calls syscore_shutdown() which takes
> syscore_ops_lock mutex. So that is must be called only in
> a schedulable context even before this patch.
>
> Summary: I do not see any obvious problem in the end.
> Except, for some small details, see below.

The presence of that syscore_shutdown() call greatly reduced my level
of anxiety as well. ;-)

> > Here is an updated commit, hopefully applying your feedback correctly.
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > commit 35679c18b062368855e183ee6712ca5c16145d8c
> > Author: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > Date: Wed Feb 5 12:27:23 2025 -0800
> >
> > printk: Flush console log from kernel_power_off()
> >
> > Kernels built with CONFIG_PREEMPT_RT=y can lose significant console output
> > and shutdown time, which hides shutdown-time RCU issues from rcutorture.
> > Therefore, make pr_flush() public and invoke it after then last print
> > in kernel_power_off().
> >
> > --- a/include/linux/printk.h
> > +++ b/include/linux/printk.h
> > @@ -207,6 +207,7 @@ void printk_legacy_allow_panic_sync(void);
> > extern bool nbcon_device_try_acquire(struct console *con);
> > extern void nbcon_device_release(struct console *con);
> > void nbcon_atomic_flush_unsafe(void);
> > +bool pr_flush(int timeout_ms, bool reset_on_progress);
> > #else
> > static inline __printf(1, 0)
> > int vprintk(const char *s, va_list args)
> > @@ -315,6 +316,11 @@ static inline void nbcon_atomic_flush_unsafe(void)
> > {
> > }
> >
> > +static bool pr_flush(int timeout_ms, bool reset_on_progress)
>
> There is missing "inline":
>
> static inline bool pr_flush(int timeout_ms, bool reset_on_progress)
>
>
> Otherwise, I get
>
> In file included from ./include/asm-generic/bug.h:22,
> from ./arch/x86/include/asm/bug.h:99,
> from ./include/linux/bug.h:5,
> from ./include/linux/page-flags.h:10,
> from kernel/bounds.c:10:
> ./include/linux/printk.h:319:13: warning: ‘pr_flush’ defined but not used [-Wunused-function]

Fixed, thank you both!

> > +{
> > + return true;
> > +}
> > +
> > #endif
> >
> > bool this_cpu_in_panic(void);
> > --- a/kernel/reboot.c
> > +++ b/kernel/reboot.c
> > @@ -704,6 +704,7 @@ void kernel_power_off(void)
> > migrate_to_reboot_cpu();
> > syscore_shutdown();
> > pr_emerg("Power down\n");
> > + pr_flush(1000, 1);
>
> This should be pr_flush(1000, true) as suggested by Sebastian.

Also fixed, also thank you both!

> > kmsg_dump(KMSG_DUMP_SHUTDOWN);
> > machine_power_off();
> > }
>
> With the two changes:
>
> Reviewed-by: Petr Mladek <pmladek@xxxxxxxx>
>
> Paul, I assume that you will pass this via the RCU tree together with
> the other changes. Or would you prefer to route this particular
> change via printk tree? Both ways are fine to me.

One advantage of carrying it via RCU is that I can arrange things so that
rcutorture users don't see this problem. So unless a problem arises,
thank you, and I do plan to send this up via RCU.

Thanx, Paul