Re: [PATCH v5 16/18] timer: Implement the hierarchical pull model

From: Frederic Weisbecker
Date: Tue Mar 14 2023 - 12:01:47 EST


Le Tue, Mar 14, 2023 at 03:49:38PM +0100, Anna-Maria Behnsen a écrit :
> On Tue, 14 Mar 2023, Frederic Weisbecker wrote:
>
> > Le Wed, Mar 01, 2023 at 03:17:42PM +0100, Anna-Maria Behnsen a écrit :
> > > diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c
> > > new file mode 100644
> > > index 000000000000..5a600de3623b
> > > --- /dev/null
> > > +++ b/kernel/time/timer_migration.c
> > > +static bool tmigr_active_up(struct tmigr_group *group,
> > > + struct tmigr_group *child,
> > > + void *ptr)
> > > +{
> > > + union tmigr_state curstate, newstate;
> > > + struct tmigr_walk *data = ptr;
> > > + bool walk_done;
> > > + u32 childmask;
> > > +
> > > + childmask = data->childmask;
> > > + newstate = curstate = data->groupstate;
> > > +
> > > +retry:
> > > + walk_done = true;
> > > +
> > > + if (newstate.migrator == TMIGR_NONE) {
> > > + newstate.migrator = (u8)childmask;
> > > +
> > > + /* Changes need to be propagated */
> > > + walk_done = false;
> > > + }
> > > +
> > > + newstate.active |= (u8)childmask;
> > > +
> > > + newstate.seq++;
> >
> > Are you sure you need this seq counter? What bad scenario can happen without it?
> >
>
> Without the seq counter we might get an inconsistent overall state of the
> groups when the order of propagating two changes of the child to the parent
> changes. To clarify what I mean, let me give you an example what happens
> without seqcount (maybe this should be described more detailed in the union
> tmigr_state description...).
>
> Let's take three groups and four CPUs (CPU2 and CPU3 as well as Group C
> will not change during the scenario):
>
> Group A
> migrator = Group B
> active = Group B, Group C
> / \
> Group B Group C
> migrator = CPU0 migrator = CPU2
> active = CPU0 active = CPU2
> / \ / \
> CPU0 CPU1 CPU2 CPU3
> active idle active idle
>
>
> 1. CPU0 goes idle (changes are updated in Group B; afterwards current
> states of Group B and Group A are stored in the data for walking the
> hierarchy):
>
> Group A
> migrator = Group B
> active = Group B, Group C
> / \
> Group B Group C
> -> migrator = TMIGR_NONE migrator = CPU2
> -> active = active = CPU2
> / \ / \
> CPU0 CPU1 CPU2 CPU3
> -> idle idle active idle
>
>
> 2. CPU1 comes out of idle (changes are update in Group B; afterwards
> current states of Group B and Group A are stored in the data for walking
> the hierarchy):
>
> Group A
> migrator = Group B
> active = Group B, Group C
> / \
> Group B Group C
> -> migrator = CPU1 migrator = CPU2
> -> active = CPU1 active = CPU2
> / \ / \
> CPU0 CPU1 CPU2 CPU3
> idle -> active active idle
>
>
> 3. Here comes the change of the order: Propagating the changes of
> 2. through the hierarchy to group A - nothing to be done, because Group
> B is already up to date.
>
> 4. Propagating the changes of 1. through the hierarchy to group A:
>
> Group A
> -> migrator = Group C
> -> active = Group C
> / \
> Group B Group C
> migrator = CPU1 migrator = CPU2
> active = CPU1 active = CPU2
> / \ / \
> CPU0 CPU1 CPU2 CPU3
> idle active active idle
>
> And now we have this inconsistent overall state..

Ooh I see now. Also that can't ever wrap up since you can only ever have no more than 8 racers
competing on a given node.

Tricky ;)