Re: [PATCH 2/2] rcu-tasks: add RCU-tasks self tests

From: Paul E. McKenney
Date: Tue Feb 16 2021 - 12:31:58 EST


On Mon, Feb 15, 2021 at 12:28:26PM +0100, Sebastian Andrzej Siewior wrote:
> On 2021-02-13 08:45:54 [-0800], Paul E. McKenney wrote:
> > Glad you like it! But let's see which (if any) of these patches solves
> > the problem for Sebastian.
>
> Looking at that, is there any reason for doing this that can not be
> solved by moving the self-test a little later? Maybe once we reached at
> least SYSTEM_SCHEDULING?

One problem is that both the spawning of ksoftirqd and the kprobes
initialization are early_initcall()s, so we cannot count on ksoftirqd
having been spawned when kprobes first uses synchronize_rcu_tasks().
Moving the selftest later won't fix this problem, but would rather just
paper over it.

> This happens now even before lockdep is up or the console is registered.
> So if something bad happens, you end up with a blank terminal.

I was getting a splat, but I could easily believe that there are
configurations where the hang is totally silent. In other words, I do
agree that this needs a proper fix. All we need do is work out an
agreeable value of "proper". ;-)

> There is nothing else that early in the boot process that requires
> working softirq. The only exception to this is wait_task_inactive()
> which is used while starting a new thread (including the ksoftirqd)
> which is why it was moved to schedule_hrtimeout().

Moving kprobes initialization to early_initcall() [1] means that there
can be a call to synchronize_rcu_tasks() before the current spawning of
ksoftirqd. Because synchronize_rcu_tasks() needs timers to work, it needs
softirq to work. I know two straightforward ways to make that happen:

1. Spawn ksoftirqd earlier.

2. Suppress attempts to awaken ksoftirqd before it exists,
forcing all softirq execution onto the back of interrupts.

Uladzislau and I each produced patches for #1, and I produced a patch
for #2.
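
For illustration only, the idea behind #2 can be modeled in userspace
roughly as follows. This is a toy sketch and not any of the patches
mentioned above: the struct, its fields, and both functions are
hypothetical names invented for the example, not kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical userspace model of one CPU's softirq state. */
struct softirq_model {
	bool ksoftirqd_spawned;	/* does the per-CPU thread exist yet? */
	unsigned int pending;	/* bitmask of raised, unhandled softirqs */
	unsigned int wakeups;	/* wakeups actually sent to the thread */
};

/* Raise softirq number nr, waking the thread only if it exists. */
static void model_raise_softirq(struct softirq_model *m, unsigned int nr)
{
	m->pending |= 1u << nr;
	if (m->ksoftirqd_spawned)
		m->wakeups++;
	/* Otherwise the bit stays pending until the next interrupt exit. */
}

/* Interrupt exit: handle everything pending, return what was handled. */
static unsigned int model_irq_exit(struct softirq_model *m)
{
	unsigned int handled = m->pending;

	m->pending = 0;
	return handled;
}
```

In this model, a softirq raised before the thread exists produces no
wakeup and is picked up by the next model_irq_exit() call.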

The only other option I know of is to push the call to init_kprobes()
later in the boot sequence, perhaps to its original subsys_initcall(),
or maybe only as late as core_initcall(). I added Masami and Steve on
CC for their thoughts on this.
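
The ordering argument can be sketched in userspace as below. The enum
lists only a subset of the kernel's initcall levels, and the enum and
helper names are hypothetical, made up for this example. The point is
that two initcalls at the same level depend on link order, which should
not be relied upon, while a later level is certain to run afterwards.

```c
#include <stdbool.h>

/* Subset of the kernel's initcall levels, in the order they run. */
enum initcall_level {
	LVL_EARLY,	/* early_initcall() */
	LVL_CORE,	/* core_initcall() */
	LVL_SUBSYS,	/* subsys_initcall() */
	LVL_LATE,	/* late_initcall() */
};

/*
 * Return true only if every initcall at level "first" has run before
 * any initcall at level "second".  Two initcalls at the same level get
 * no such guarantee here, so that case returns false.
 */
static bool runs_strictly_before(enum initcall_level first,
				 enum initcall_level second)
{
	return first < second;
}
```

With ksoftirqd spawned at LVL_EARLY, an init_kprobes() at LVL_EARLY gets
no guarantee, whereas one at LVL_CORE or LVL_SUBSYS does.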

Is there some other proper fix that I am missing?

Thanx, Paul

[1] 36dadef23fcc ("kprobes: Init kprobes in early_initcall")