Re: [PATCH v3 1/2] rcutorture: Perform more frequent testing of ->gpwrap
From: Joel Fernandes
Date: Wed Apr 16 2025 - 07:19:37 EST
> On Apr 15, 2025, at 8:19 PM, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Mon, Apr 14, 2025 at 11:05:45AM -0400, Joel Fernandes wrote:
>> On 4/10/2025 2:29 PM, Paul E. McKenney wrote:
>>>> +static int rcu_gpwrap_lag_init(void)
>>>> +{
>>>> + if (gpwrap_lag_cycle_mins <= 0 || gpwrap_lag_active_mins <= 0) {
>>>> + pr_alert("rcu-torture: lag timing parameters must be positive\n");
>>>> + return -EINVAL;
>>> When rcutorture is initiated by modprobe, this makes perfect sense.
>>>
>>> But if rcutorture is built in, we have other choices: (1) Disable gpwrap
>>> testing and do other testing but splat so that the bogus scripting can
>>> be fixed, (2) Force default values and splat as before, (3) Splat and
>>> halt the system.
>>>
>>> The usual approach has been #1, but what makes sense in this case?
>>
>> If the user deliberately tries to prevent the test, I am Ok with #3 which I
>> believe is the current behavior. But otherwise #1 is also Ok with me but I don't
>> feel strongly about doing that.
>>
>> If we want to do #1, it will just involve changing the "return -EINVAL" to
>> "return 0", but we may need to do so only if RCU torture is built in.
>>
>> IMO the current behavior is more reasonable than adding more complexity for
>> an unusual case for a built-in?
>
> The danger is that someone adjusts a scenario, accidentally disables
> *all* ->gpwrap testing during built-in tests (kvm.sh, kvm-remote.sh,
> and torture.sh), and nobody notices. This has tripped me up in the
> past, hence the existing splats in rcutorture, but only for runs with
> built-in rcutorture.
But in the code we are discussing, we will splat with an error if either parameter is set to 0? Sorry if I missed something.
>
>> On the other hand if the issue is with providing the user with a way to disable
>> gpwrap testing, that should IMO be another parameter than setting the _mins
>> parameters to be 0. But I think we may not want this testing disabled since it
>> is already "self-disabled" for the first 25 minutes.
>
> We do need a way of disabling the testing on long runs for fault-isolation
> purposes.
Thanks, I will add an option for this.
>
> For example, rcutorture.n_up_down=0 disables SRCU up/down testing.
> Speaking of which, I am adding a section on that topic to this document:
>
> https://docs.google.com/document/d/1RoYRrTsabdeTXcldzpoMnpmmCjGbJNWtDXN6ZNr_4H8/edit?usp=sharing
Nice, thanks,
- Joel
>
> Thanx, Paul
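
For reference, option #1 from the discussion above (splat, disable only
gpwrap testing, and let the rest of the run continue when rcutorture is built
in) could be modeled roughly as below. This is a minimal userspace sketch, not
the patch itself: the struct, the "enabled" flag, the "builtin" argument and
the function name are all hypothetical stand-ins, and in the kernel the
built-in check might instead be keyed off something like
IS_BUILTIN(CONFIG_RCU_TORTURE_TEST).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the lag-timing parameters in the patch. */
struct gpwrap_lag_cfg {
	int cycle_mins;   /* models gpwrap_lag_cycle_mins */
	int active_mins;  /* models gpwrap_lag_active_mins */
	bool enabled;     /* hypothetical flag: is gpwrap lag testing on? */
};

/*
 * Sketch of option #1 (assumed behavior, not the actual patch):
 * - valid parameters: enable gpwrap lag testing and return 0;
 * - invalid parameters, module case: complain and return -EINVAL,
 *   which matches the behavior quoted at the top of the thread;
 * - invalid parameters, built-in case: complain, disable only the
 *   gpwrap lag testing, and return 0 so other testing continues.
 */
int gpwrap_lag_init_sketch(struct gpwrap_lag_cfg *cfg, bool builtin)
{
	assert(cfg != NULL);

	if (cfg->cycle_mins > 0 && cfg->active_mins > 0) {
		cfg->enabled = true;
		return 0;
	}

	/* Stands in for the pr_alert() splat in the patch. */
	fprintf(stderr, "rcu-torture: lag timing parameters must be positive\n");

	if (!builtin)
		return -EINVAL;

	cfg->enabled = false;
	return 0;
}
```

With this shape, a bad scenario file on a built-in run still produces the
complaint, so accidentally disabled ->gpwrap testing remains visible in the
console log, while a modprobe with bad parameters fails outright.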