Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

From: Paul E. McKenney
Date: Thu Apr 03 2014 - 13:42:12 EST


On Thu, Apr 03, 2014 at 05:33:58PM +0000, Mathieu Desnoyers wrote:
> ----- Original Message -----
> > From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> > To: "Mathieu Desnoyers" <mathieu.desnoyers@xxxxxxxxxxxx>
> > Cc: fweisbec@xxxxxxxxx, peterz@xxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
> > Sent: Thursday, April 3, 2014 1:21:58 PM
> > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> >
> > On Thu, Apr 03, 2014 at 02:57:55AM +0000, Mathieu Desnoyers wrote:
> > > ----- Original Message -----
> > > > From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> > > > To: "Mathieu Desnoyers" <mathieu.desnoyers@xxxxxxxxxxxx>
> > > > Cc: fweisbec@xxxxxxxxx, peterz@xxxxxxxxxxxxx,
> > > > linux-kernel@xxxxxxxxxxxxxxx
> > > > Sent: Tuesday, April 1, 2014 10:50:44 PM
> > > > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > >
> > > [...]
> > > > > Here is the experiment I did on this latest version: I just enabled the
> > > > > safety checking (not progress). I commented these lines out (again):
> > > > >
> > > > > /* If needed, wake up the timekeeper. */
> > > > > if
> > > > > //:: oldstate == RCU_SYSIDLE_FULL_NOTED ->
> > > > > // wakeup_timekeeper = 1;
> > > > > :: else -> skip;
> > > > > fi;
> > > > >
> > > > > compile and run with:
> > > > >
> > > > > spin -a sysidle.spin
> > > > > gcc -o pan pan.c
> > > > > ./pan -m1280000 -w22
> > > > >
> > > > > My expectation would be to trigger some kind of assertion that the
> > > > > timekeeper is not active while the worker is running. Unfortunately,
> > > > > nothing triggers.
> > > >
> > > > That is expected behavior. Failure to wake up the timekeeper is set
> > > > up as a progress criterion.
> > > >
> > > > > I see 3 possible solutions: we could either add assertions into
> > > > > other branches of the timekeeper, or add assertions into the worker
> > > > > thread. Another way to do it would be to express the assertions as
> > > > > negation of a LTL formula based on state variables.
> > > >
> > > > I did try both a hand-coded "never" clause and LTL formulas without
> > > > success. You have more experience with them, so perhaps you could make
> > > > something work. The need is that if all worker threads go non-idle
> > > > indefinitely starting from the RCU_SYSIDLE_FULL_NOTED state, the
> > > > wakeup_timekeeper must eventually be set to 1, and then the timekeeper
> > > > must start running, for example, the wakeup_timekeeper value must revert
> > > > to zero.
> > > >
> > > > The problem I had is that the "never" claims seem set up to catch some
> > > > finite behavior after a possibly-infinite prefix. In this case, we
> > > > instead need to catch some infinite behavior after a possibly-infinite
> > > > prefix.
> > > >
> > > > > Thoughts ?
> > > >
> > > > Me, I eventually switched over to using progress detection. Which might
> > > > be a bit unconventional, but it did have the virtue of making the
> > > > model complain when it should complain. ;-)
> > >
> > > Here is my attempt at simplifying the model. I use LTL formulas as checks
> > > for the two things I think really matter here: having timekeeping
> > > eventually
> > > active when threads are running, and having timekeeping eventually inactive
> > > when threads are idle. Hopefully my Promela is not too rusty:
> >
> > Well, this was the first time I had ever tried using LTL or never claims,
> > so you are ahead of me either way. ;-)
> >
> > > 1) When, at any point in the trail, a worker is setup, then we want to
> > > be sure that at some point in the future the timer is necessarily
> > > running:
> > >
> > > timer_active.ltl:
> > > [] (am_setup_0 -> (<> timekeeper_is_running))
> >
> > OK, am_setup_0 implies that the timekeeper will eventually be running.
> > For two threads, this presumably becomes:
> >
> > [] ((am_setup_0 || am_setup_1) -> (<> timekeeper_is_running))
>
> Yes.
>
> >
> > > 2) When, at any point in the trail, a worker is not setup, then we
> > > want to be sure that at some point in the future, either this
> > > thread is setup again, or the timekeeper reaches a not running
> > > state:
> > >
> > > timer_inactive.ltl:
> > > [] (!am_setup_0 -> (<> (!timekeeper_is_running || am_setup_0)))
> >
> > And here the two-thread version should be something like this:
> >
> > [] (!(am_setup_0 || am_setup_1) -> (<> (!timekeeper_is_running || am_setup_0
> > || am_setup_1)))
>
> Yes, I think. Should be double-checked and tested though.

I did test all four options in the sysidle.sh file with both the original
and two-thread LTL claims, and got the expected results. Need something
to force failures in the other LTL formula, of course.

> > It would be nice if never claims allowed local variables, as that would
> > allow looping over the am_setup array. Or maybe I just haven't figured
> > out how to tell spin that a given variable should not be considered to
> > be part of the global state.
>
> AFAIU, we're very limited in what we can put in LTL. You could generate
> a never claim by hand (see the generated pan.ltl as an example), and maybe
> have more luck.

Any time I tried to assign to anything from within a "never" claim,
it yelled at me. ;-)

> > > sysidle.sh:
> > > #!/bin/sh
> > >
> > > spin -f "!(`cat timer_active.ltl`)" > pan.ltl
> > > #spin -f "!(`cat timer_inactive.ltl`)" > pan.ltl
> > > spin -a -X -N pan.ltl sysidle.spin
> > > #spin -DINJECT_MISSING_WAKEUP -a -X -N pan.ltl sysidle.spin
> >
> > I definitely didn't use "-X". The "spin --help" says "-[XYZ] reserved
> > for use by xspin interface", so maybe it doesn't matter. ;-)
>
> Hopefully. I grabbed the "-X" option from the urcu models. I don't
> remember why we have it there.

Ah! ;-)

> > > gcc -o pan pan.c
> > > ./pan -f -a -m1280000 -w22
> > >
> > > #view trail with:
> > > # spin -v -t -N pan.ltl sysidle.spin
> >
> > Interesting. I have put this into a separate branch. May I use your
> > Signed-off-by?
>
> Yes, of course.

Very good, thank you!

> > I need to play around a bit more with LTL and progress labels!
>
> For progress, you'll want to add progress labels within the model, and
> do a LTL formula checking the _np special variable.

I did use the progress labels, but used -DNP rather than the LTL _np
special variable.

Thanx, Paul

> Thanks,
>
> Mathieu
>
>
> >
> > Thanx, Paul
> >
> > > sysidle.spin:
> > >
> > > /*
> > > * Promela model for CONFIG_NO_HZ_FULL_SYSIDLE=y in the Linux kernel.
> > > * This model assumes that the dyntick-idle bit manipulation works based
> > > * on long usage, and substitutes a per-thread boolean "am_busy[]" array
> > > * for the Linux kernel's dyntick-idle masks. The focus of this model
> > > * is therefore on the state machine itself. Models timekeeper "running"
> > > * state with respect to worker thread idle state.
> > > *
> > > * This program is free software; you can redistribute it and/or modify
> > > * it under the terms of the GNU General Public License as published by
> > > * the Free Software Foundation; either version 2 of the License, or
> > > * (at your option) any later version.
> > > *
> > > * This program is distributed in the hope that it will be useful,
> > > * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > > * GNU General Public License for more details.
> > > *
> > > * You should have received a copy of the GNU General Public License
> > > * along with this program; if not, you can access it online at
> > > * http://www.gnu.org/licenses/gpl-2.0.html.
> > > *
> > > * Copyright IBM Corporation, 2014
> > > * Copyright EfficiOS, 2014
> > > *
> > > * Author: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > > * Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> > > */
> > >
> > > // adapt LTL formulas before changing NUM_WORKERS
> > > //#define NUM_WORKERS 2
> > > #define NUM_WORKERS 1
> > > /* Defines used because LTL formula interprets [] */
> > > #define am_setup_0 am_setup[0]
> > > #define am_setup_1 am_setup[1]
> > >
> > > byte wakeup_timekeeper = 0; /* Models rcu_kick_nohz_cpu(). */
> > >
> > > #define RCU_SYSIDLE_NOT 0 /* Some CPU is not idle. */
> > > #define RCU_SYSIDLE_SHORT 1 /* All CPUs idle for brief period.
> > > */
> > > #define RCU_SYSIDLE_LONG 2 /* All CPUs idle for long enough.
> > > */
> > > #define RCU_SYSIDLE_FULL 3 /* All CPUs idle, ready for
> > > sysidle. */
> > > #define RCU_SYSIDLE_FULL_NOTED 4 /* Actually entered sysidle state.
> > > */
> > >
> > > byte full_sysidle_state = RCU_SYSIDLE_NOT;
> > >
> > > byte am_busy[NUM_WORKERS]; /* Busy is similar to "not dyntick-idle". */
> > > byte am_setup[NUM_WORKERS]; /* Setup means timekeeper knows I am not idle.
> > > */
> > >
> > > byte timekeeper_is_running = 1; /* Timekeeper initially running */
> > >
> > > /*
> > > * Non-timekeeping CPU going into and out of dyntick-idle state.
> > > */
> > > proctype worker(byte me)
> > > {
> > > byte oldstate;
> > >
> > > do
> > > :: 1 ->
> > > /* Go idle. */
> > > am_setup[me] = 0;
> > > am_busy[me] = 0;
> > >
> > > /* Dyntick-idle in the following loop. */
> > > do
> > > :: 1 -> skip;
> > > :: 1 -> break;
> > > od;
> > >
> > > /* Exit idle loop, model getting out of dyntick idle state.
> > > */
> > > am_busy[me] = 1;
> > >
> > > /* Get state out of full-system idle states. */
> > > atomic {
> > > oldstate = full_sysidle_state;
> > > if
> > > :: oldstate > RCU_SYSIDLE_SHORT ->
> > > full_sysidle_state = RCU_SYSIDLE_NOT;
> > > :: else -> skip;
> > > fi;
> > > }
> > >
> > > /* If needed, wake up the timekeeper. */
> > > if
> > > :: oldstate == RCU_SYSIDLE_FULL_NOTED ->
> > > #ifndef INJECT_MISSING_WAKEUP
> > > wakeup_timekeeper = 1;
> > > #else
> > > skip;
> > > #endif
> > > :: else -> skip;
> > > fi;
> > >
> > > /* Mark ourselves fully awake and operational. */
> > > am_setup[me] = 1;
> > >
> > > /* We are fully awake, so timekeeper must not be asleep. */
> > > assert(full_sysidle_state < RCU_SYSIDLE_FULL);
> > >
> > > /* Running in kernel in the following loop. */
> > > do
> > > :: 1 -> skip;
> > > :: 1 -> break;
> > > od;
> > > od
> > > }
> > >
> > > /*
> > > * Are all the workers in dyntick-idle state?
> > > */
> > > #define check_idle() \
> > > i = 0; \
> > > idle = 1; \
> > > do \
> > > :: i < NUM_WORKERS -> \
> > > if \
> > > :: am_busy[i] == 1 -> idle = 0; \
> > > :: else -> skip; \
> > > fi; \
> > > i++; \
> > > :: i >= NUM_WORKERS -> break; \
> > > od
> > >
> > > /*
> > > * Timekeeping CPU.
> > > */
> > > proctype timekeeper()
> > > {
> > > byte i;
> > > byte idle;
> > > byte curstate;
> > > byte newstate;
> > >
> > > do
> > > :: 1 ->
> > > /* Capture current state. */
> > > check_idle();
> > > curstate = full_sysidle_state;
> > > newstate = curstate;
> > >
> > > /* Manage state... */
> > > if
> > > :: idle == 1 && curstate < RCU_SYSIDLE_FULL_NOTED ->
> > > /* Idle, advance to next state. */
> > > atomic {
> > > if
> > > :: full_sysidle_state == curstate ->
> > > newstate = curstate + 1;
> > > full_sysidle_state = newstate;
> > > :: else -> skip;
> > > fi;
> > > }
> > > :: idle == 0 && full_sysidle_state >= RCU_SYSIDLE_LONG ->
> > > /* Non-idle and state advanced, revert to base
> > > state. */
> > > full_sysidle_state = RCU_SYSIDLE_NOT;
> > > :: else -> skip;
> > > fi;
> > >
> > > /* If in RCU_SYSIDLE_FULL_NOTED, wait to be awakened. */
> > > if
> > > :: newstate != RCU_SYSIDLE_FULL_NOTED ->
> > > skip;
> > > :: newstate == RCU_SYSIDLE_FULL_NOTED ->
> > > timekeeper_is_running = 0;
> > > do
> > > :: wakeup_timekeeper == 0 ->
> > > skip; /* Awaiting wake up */
> > > :: else ->
> > > timekeeper_is_running = 1;
> > > wakeup_timekeeper = 0;
> > > break; /* awakened */
> > > od;
> > > fi;
> > > assert(full_sysidle_state <= RCU_SYSIDLE_FULL_NOTED);
> > > od;
> > > }
> > >
> > > init {
> > > byte i = 0;
> > >
> > > do
> > > :: i < NUM_WORKERS ->
> > > am_busy[i] = 1;
> > > am_setup[i] = 1;
> > > run worker(i);
> > > i++;
> > > :: i >= NUM_WORKERS -> break;
> > > od;
> > > run timekeeper();
> > > }
> > >
> > > --
> > > Mathieu Desnoyers
> > > EfficiOS Inc.
> > > http://www.efficios.com
> > >
> >
> >
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/