Re: [PATCH] rcu-tasks: Delay rcu_tasks_verify_self_tests() to avoid missed callbacks

From: Waiman Long
Date: Mon Jun 13 2022 - 16:17:39 EST


On 6/13/22 13:56, Paul E. McKenney wrote:
On Mon, Jun 13, 2022 at 12:01:24PM -0400, Waiman Long wrote:
On 6/10/22 16:58, Paul E. McKenney wrote:
On Fri, Jun 10, 2022 at 02:42:12PM -0400, Waiman Long wrote:
Even though the rcu_tasks selftest is initiated early in the boot process,
the verification done at late initcall time may not be late enough to
catch all the callbacks, especially on systems with just a few CPUs and
little memory.

After 12 boots on an s390x system, 1 of them failed the rcu_tasks
verification test.

[ 8.183013] call_rcu_tasks() has been failed.
[ 8.183041] WARNING: CPU: 0 PID: 1 at kernel/rcu/tasks.h:1696 rcu_tasks_verify_self_tests+0x64/0xd0
[ 8.203246] Callback from call_rcu_tasks() invoked.

In this particular case, the callback missed the check by about
20ms. Similar rcu_tasks selftest failures are also seen on ppc64le
systems.

[ 0.313391] call_rcu_tasks() has been failed.
[ 0.313407] WARNING: CPU: 0 PID: 1 at kernel/rcu/tasks.h:1696 rcu_tasks_verify_self_tests+0x5c/0xa0
[ 0.335569] Callback from call_rcu_tasks() invoked.

Avoid this missed callback by delaying the verification using
delayed_work. The delay is set to about 0.1s, which hopefully will
be long enough to catch all the callbacks on systems with few CPUs and
little memory.

Fixes: bfba7ed084f8 ("rcu-tasks: Add RCU-tasks self tests")
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>

Good catch, thank you!

A few days ago, I queued this:

2585014188d5 ("rcu-tasks: Be more patient for RCU Tasks boot-time testing")

This is shown in full at the end of this email. Does this fix this
problem for you?

I think your patch should fix the false-positive warning, and it gives plenty
of time for this to happen.

I do have one question though. rcu_tasks_verify_self_tests() is called from
do_initcalls(). Since it may not be the last late initcall, does that mean
other late initcalls queued after it may be delayed by a second or more?

Indeed. Which is why I would welcome the workqueues portion of your
patch on top of the above patch in -rcu. ;-)

Sure. I will work on such a follow-up patch.

Cheers,
Longman