Re: [PATCH v3 1/2] rcutorture: Perform more frequent testing of ->gpwrap
From: Paul E. McKenney
Date: Tue Apr 15 2025 - 20:19:26 EST
On Mon, Apr 14, 2025 at 11:05:45AM -0400, Joel Fernandes wrote:
> On 4/10/2025 2:29 PM, Paul E. McKenney wrote:
> >> +static int rcu_gpwrap_lag_init(void)
> >> +{
> >> + if (gpwrap_lag_cycle_mins <= 0 || gpwrap_lag_active_mins <= 0) {
> >> + pr_alert("rcu-torture: lag timing parameters must be positive\n");
> >> + return -EINVAL;
> > When rcutorture is initiated by modprobe, this makes perfect sense.
> >
> > But if rcutorture is built in, we have other choices: (1) Disable gpwrap
> > testing and do other testing but splat so that the bogus scripting can
> > be fixed, (2) Force default values and splat as before, (3) Splat and
> > halt the system.
> >
> > The usual approach has been #1, but what makes sense in this case?
>
> If the user deliberately tries to prevent the test, I am Ok with #3 which I
> believe is the current behavior. But otherwise #1 is also Ok with me but I don't
> feel strongly about doing that.
>
> If we want to do #3, it will just involve changing the "return -EINVAL" to
> "return 0" but also may need to be doing so only if RCU torture is a built-in.
>
> IMO the current behavior is more reasonable than adding more complexity for an
> unusual case for a built-in?
The danger is that someone adjusts a scenario, accidentally disables
*all* ->gpwrap testing during built-in tests (kvm.sh, kvm-remote.sh,
and torture.sh), and nobody notices. This has tripped me up in the
past, hence the existing splats in rcutorture, but only for runs with
built-in rcutorture.
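
For concreteness, here is a rough userspace model of option #1, not a
proposed patch. It assumes the modprobe case keeps returning -EINVAL,
while the built-in case complains, leaves ->gpwrap lag testing disabled,
and lets the rest of the run continue. The function name, the state
structure, and the built_in argument are hypothetical. In the kernel the
built-in check would presumably use something like
IS_MODULE(CONFIG_RCU_TORTURE_TEST), and the complaint would be a WARN
splat rather than a flag.

```c
/* Userspace model only, not kernel code.  Everything here except the
 * meaning of the two *_mins parameters is hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

struct gpwrap_lag_state {
	bool enabled;	/* true if ->gpwrap lag testing will run */
	bool splatted;	/* stands in for a WARN splat in the kernel */
};

/*
 * Model of option #1: a module load fails on bad timing parameters, but a
 * built-in run complains, leaves ->gpwrap lag testing off, and continues.
 */
static int gpwrap_lag_init_model(int cycle_mins, int active_mins,
				 bool built_in, struct gpwrap_lag_state *st)
{
	assert(st != NULL);
	st->enabled = false;
	st->splatted = false;

	if (cycle_mins <= 0 || active_mins <= 0) {
		fprintf(stderr,
			"rcu-torture: lag timing parameters must be positive\n");
		if (!built_in)
			return -EINVAL;	/* modprobe: refuse to load */
		st->splatted = true;	/* built-in: make the mistake visible */
		return 0;		/* but keep doing other testing */
	}

	st->enabled = true;
	return 0;
}
```

The point of the splat in the built-in case is that a scenario file that
accidentally disables the testing produces a visible failure in the run
summary instead of a silent loss of coverage.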
> On the other hand if the issue is with providing the user with a way to disable
> gpwrap testing, that should IMO be another parameter than setting the _mins
> parameters to be 0. But I think we may not want this testing disabled since it
> is already "self-disabled" for the first 25 minutes.
We do need a way of disabling the testing on long runs for fault-isolation
purposes.
For example, rcutorture.n_up_down=0 disables SRCU up/down testing.
Speaking of which, I am adding a section on that topic to this document:
https://docs.google.com/document/d/1RoYRrTsabdeTXcldzpoMnpmmCjGbJNWtDXN6ZNr_4H8/edit?usp=sharing
Thanx, Paul