Re: [PATCH 1/3] sched/fair: Disable runtime_enabled on dying rq

From: Kirill Tkhai
Date: Mon Jun 23 2014 - 16:49:53 EST


On 23.06.2014 21:29, bsegall@xxxxxxxxxx wrote:
> Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:
>
>> On Tue, Jun 17, 2014 at 05:24:10PM +0400, Kirill Tkhai wrote:
>>> @@ -3790,6 +3803,12 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
>>> cfs_rq->runtime_remaining = 1;
>>> if (cfs_rq_throttled(cfs_rq))
>>> unthrottle_cfs_rq(cfs_rq);
>>> +
>>> + /*
>>> + * Offline rq is schedulable till cpu is completely disabled
>>> + * in take_cpu_down(), so we prevent new cfs throttling here.
>>> + */
>>> + cfs_rq->runtime_enabled = 0;
>>
>> Does it make sense to clear this before calling unthrottle_cfs_rq()?
>> Just to make sure they're in the right order..
>
> I believe that order is irrelevant here - I do not believe that
> unthrottle_cfs_rq(a) can cause a throttle_cfs_rq(a). In fact, I don't
> see any code that will check it at all from unthrottle, although I might
> be missing something. It _can_ cause a throttle_cfs_rq(parent_cfs_rq(a)),
> but that should be fine as long as for_each_leaf_cfs_rq is sorted
> correctly.

I think this is correct. We may change the order just for the hope, that
anybody will work on it in some way in the future, and this person could
skip this opaque place. Ok, I don't know how is better :)

> That said, migrate_tasks drops rq->lock, and I /think/ another cpu could
> wake another task onto this cpu, which could then enqueue_throttle its
> cfs_rq (which previously had no tasks and thus wasn't on
> leaf_cfs_rq_list). You certainly could have tg_set_bandwidth come in and
> turn runtime_enabled on.

We mask cpu inactive on CPU_DOWN_PREPARE stage (in sched_cpu_inactive).
Other cpu is not able to wake a task there after that.

rq is masked offline in cpuset_cpu_inactive() (during the same stage).
But priority of sched_cpu_inactive() is higher than priority of
cpuset_cpu_inactive().

CPU_PRI_SCHED_INACTIVE = INT_MIN + 1,
CPU_PRI_CPUSET_INACTIVE = INT_MIN,

This guarantees that nobody could use dying_cpu even before we start
unthrottling. Another cpu will see dying_cpu is inactive.

So, it looks like we are free of this problem.

> I think the general idea of turning runtime_enabled off during offline
> is fine, ccing pjt to double check.

Thanks,
Kirill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/