Re: [patch V2 08/10] timer: Implement the hierarchical pull model
From: Peter Zijlstra
Date: Wed Apr 19 2017 - 06:23:14 EST
On Tue, Apr 18, 2017 at 01:11:10PM +0200, Thomas Gleixner wrote:
> +static u64 tmigr_set_cpu_inactive(struct tmigr_group *group,
> + struct tmigr_group *child,
> + struct tmigr_event *evt,
> + unsigned int cpu)
> +{
> + struct tmigr_group *parent;
> + u64 nextevt = KTIME_MAX;
> +
> + raw_spin_lock_nested(&group->lock, group->level);
> +
> + DBG_BUG_ON(!group->active);
> +
> + cpumask_clear_cpu(cpu, group->cpus);
> + group->active--;
> +
> + /*
> + * If @child is not NULL, then this is a recursive invocation to
> + * propagate the deactivation of @cpu. If @child has a new migrator
> + * set it active in @group.
> + */
> + if (child && child->migrator != TMIGR_NONE) {
> + cpumask_set_cpu(child->migrator, group->cpus);
> + group->active++;
And I'm confused...
If we retain child->migrator as 'active', should we then not also re-set
our own bit for that child group?
> + }
> +
> + /* Add @evt to @group */
> + tmigr_add_evt(group, evt);
> +
> + /* If @cpu is not the active migrator, everything is up to date */
> + if (group->migrator != cpu)
> + goto done;
At this point we have already cleared @cpu's bit in our group->cpus. Is
that right?
> + /* Update the migrator. */
> + if (!group->active)
> + group->migrator = TMIGR_NONE;
> + else
> + group->migrator = cpumask_first(group->cpus);
So here we could have changed ->migrator away from @cpu, no?
> +
> + parent = group->parent;
> + if (parent) {
> + /*
> + * @cpu was the migrator in @group, so it is marked as
> + * active in its parent group(s) as well. Propagate the
> + * migrator change.
> + */
So how is that then still valid? This seems to hinge on the
assumption that @cpu is the migrator.
> + evt = group->active ? NULL : &group->groupevt;
> + nextevt = tmigr_set_cpu_inactive(parent, group, evt, cpu);
In general I'm a wee bit confused about how this works. Do we at all times
retain a migrator per group, or only one per group that has activity,
which then reduces to 1 per system when the whole system idles?
I'll stare at this a bit more, but I feel a comment explaining things
wouldn't go amiss.
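
To make the questions above concrete, here is a small userspace toy model
of just the quoted hunk. It is only a sketch of my reading, not the patch
itself: all toy_* names are made up, a plain 64-bit mask stands in for
group->cpus, and locking and the event queue (tmigr_add_evt, nextevt) are
dropped. The starting state in toy_init() is an assumption, since the
activation path is not part of the quoted code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NONE (~0u)	/* hypothetical stand-in for TMIGR_NONE */

/* Hypothetical, stripped-down stand-in for struct tmigr_group. */
struct toy_group {
	struct toy_group *parent;
	uint64_t cpus;		/* one bit per CPU marked active here */
	unsigned int active;	/* number of active entries */
	unsigned int migrator;	/* CPU number or TOY_NONE */
};

/* Lowest set bit, playing the role of cpumask_first(). */
static unsigned int toy_first(uint64_t mask)
{
	for (unsigned int i = 0; i < 64; i++)
		if (mask & (UINT64_C(1) << i))
			return i;
	return TOY_NONE;
}

/* Mirrors the control flow of the quoted function, minus events/locks. */
void toy_set_cpu_inactive(struct toy_group *group,
			  struct toy_group *child, unsigned int cpu)
{
	assert(group->active);

	group->cpus &= ~(UINT64_C(1) << cpu);
	group->active--;

	/* Recursive call: mark the child's new migrator active here. */
	if (child && child->migrator != TOY_NONE) {
		group->cpus |= UINT64_C(1) << child->migrator;
		group->active++;
	}

	/* @cpu was not the migrator, nothing to hand over. */
	if (group->migrator != cpu)
		return;

	/* Hand the migrator role over, or drop it if nobody is active. */
	group->migrator = group->active ? toy_first(group->cpus) : TOY_NONE;

	if (group->parent)
		toy_set_cpu_inactive(group->parent, group, cpu);
}

/*
 * Assumed starting state: two leaf groups {0,1} and {2,3}, all CPUs
 * active, and the top group tracking the leaf migrators 0 and 2.
 */
void toy_init(struct toy_group *top, struct toy_group *g0,
	      struct toy_group *g1)
{
	*top = (struct toy_group){ NULL, 0x5, 2, 0 };
	*g0  = (struct toy_group){ top,  0x3, 2, 0 };
	*g1  = (struct toy_group){ top,  0xc, 2, 2 };
}
```

In this model, deactivating CPU 0 moves the leaf migrator to CPU 1, and
the recursive call swaps bit 0 for bit 1 in the top group before picking
CPU 1 as the top migrator as well. Taking CPUs 1, 3 and 2 down afterwards
leaves every group with TOY_NONE. So, in the toy model at least, a
migrator exists only in groups that still have activity, and the "@cpu
was the migrator" comment refers to the state before the update. Whether
that matches the intent of the real code is exactly what I'd like the
comment to spell out.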