Re: [PATCH 2/3] sched/deadline: fix bandwidth check/update when migrating tasks between exclusive cpusets

From: Juri Lelli
Date: Tue Oct 07 2014 - 04:59:27 EST


Hi Peter,

On 19/09/14 22:25, Peter Zijlstra wrote:
> On Fri, Sep 19, 2014 at 10:22:40AM +0100, Juri Lelli wrote:
>> Exclusive cpusets are the only way users can restrict the affinity
>> of SCHED_DEADLINE tasks (performing what is commonly called
>> clustered scheduling). Unfortunately, this is currently broken for
>> two reasons:
>>
>> - No check is performed when the user tries to attach a task to
>> an exclusive cpuset (recall that exclusive cpusets have an
>> associated maximum allowed bandwidth).
>>
>> - Bandwidths of source and destination cpusets are not correctly
>> updated after a task is migrated between them.
>>
>> This patch fixes both things at once, as they are two sides of
>> the same coin.
>>
>> The check is performed in cpuset_can_attach(), as there are no
>> points of failure after that function. The update is split into
>> two halves. We first reserve bandwidth in the destination cpuset,
>> once the check in cpuset_can_attach() passes, and we then release
>> bandwidth from the source cpuset when the task's affinity is
>> actually changed. Even if there can be time windows in which
>> sched_setattr() may erroneously fail in the source cpuset, we can
>> live with that, as we can't perform an atomic update of both
>> cpusets at once.
>
> The thing I cannot find is whether we correctly deal with updates to
> the cpuset. Say we first set up 2 (exclusive) sets A:cpu0 B:cpu1-3,
> then assign tasks, and then update the cpu masks like: B:cpu2,3,
> A:cpu1,2.
>
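
Before getting to that, here is a toy user-space model of the
two-step accounting described in the changelog above. The struct,
helper names, and numbers below are made up for illustration; they
are not the actual kernel symbols:

#include <stdbool.h>
#include <stdio.h>

/* Toy per-cpuset bandwidth accounting, modelled after the changelog. */
struct dl_bw {
	unsigned long long bw;		/* maximum allowed bandwidth */
	unsigned long long total_bw;	/* bandwidth currently allocated */
};

/* Would adding task_bw overflow the set's allowance? */
static bool dl_overflow(struct dl_bw *b, unsigned long long task_bw)
{
	return b->total_bw + task_bw > b->bw;
}

/*
 * First half: at cpuset_can_attach() time, check and reserve in the
 * destination; failing here aborts the attach cleanly.
 */
static bool can_attach(struct dl_bw *dst, unsigned long long task_bw)
{
	if (dl_overflow(dst, task_bw))
		return false;
	dst->total_bw += task_bw;
	return true;
}

/*
 * Second half: once the task's affinity has actually changed,
 * release the bandwidth still held in the source set.
 */
static void finish_migration(struct dl_bw *src, unsigned long long task_bw)
{
	src->total_bw -= task_bw;
}

int main(void)
{
	struct dl_bw a = { .bw = 100, .total_bw = 60 };
	struct dl_bw b = { .bw = 100, .total_bw = 90 };
	unsigned long long task_bw = 30;

	/* Moving a 30-unit task into b must fail: only 10 units free. */
	if (!can_attach(&b, task_bw))
		printf("attach to b rejected\n");

	/* A destination with spare capacity accepts the task, and only
	 * then does the source drop its reservation. */
	struct dl_bw c = { .bw = 100, .total_bw = 0 };
	if (can_attach(&c, task_bw))
		finish_migration(&a, task_bw);
	printf("a: %llu, c: %llu\n", a.total_bw, c.total_bw);
	return 0;
}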

So, what follows should address the problem you describe.

Assuming you intended that we try to update the masks to A:cpu0,3 and
B:cpu1,2, with the approach below we can check that removing cpu3
from B doesn't break guarantees. Only after that check passes can
cpu3 be put in A.
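
The intuition: an exclusive cpuset's total capacity scales with the
number of CPUs in its span, so shrinking the span is only admitted if
the remaining CPUs still cover the bandwidth already allocated in the
set. A minimal sketch of that admission test (the name and signature
are mine, not the patch's):

/*
 * Toy check, not the kernel code: a CPU may leave an exclusive set
 * only if the remaining span still covers the bandwidth already
 * allocated to SCHED_DEADLINE tasks in that set.
 */
static bool can_remove_cpu(unsigned long long bw_per_cpu,
			   unsigned long long total_bw,
			   unsigned int span_weight)
{
	return span_weight > 0 &&
	       total_bw <= bw_per_cpu * (span_weight - 1);
}

In your example, cpu3 may leave B only if B:cpu1,2 can still host B's
allocated bandwidth; adding cpu3 to A afterwards only increases A's
capacity, so that side needs no check.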

Does this make sense?

Thanks,

- Juri