Re: [PATCH v6 13/21] sched: Admit forcefully-affined tasks into SCHED_DEADLINE

From: Juri Lelli
Date: Fri May 21 2021 - 10:04:32 EST


On 21/05/21 13:02, Quentin Perret wrote:

...

> So I think Will has a point since, IIRC, the root domains get rebuilt
> during hotplug. So you can imagine a case with a single root domain, but
> CPUs 4-7 are offline. In this case, sched_setattr() will happily promote
> a task to DL as long as its affinity mask is a superset of the rd span,
> but things may get ugly when CPUs are plugged back in later on.
>
> This looks like an existing bug though. I just tried the following on a
> system with 4 CPUs:
>
> // Create a task affined to CPU [0-2]
> > while true; do echo "Hi" > /dev/null; done &
> [1] 560
> > mypid=$!
> > taskset -p 7 $mypid
> pid 560's current affinity mask: f
> pid 560's new affinity mask: 7
>
> // Try to move it to DL, this should fail because of the affinity
> > chrt -d -T 5000000 -P 16666666 -p 0 $mypid
> chrt: failed to set pid 560's policy: Operation not permitted
>
> // Offline CPU 3, so the rd now covers CPUs 0-2 only
> > echo 0 > /sys/devices/system/cpu/cpu3/online
> [ 400.843830] CPU3: shutdown
> [ 400.844100] psci: CPU3 killed (polled 0 ms)
>
> // Try to admit the task again, which now succeeds
> > chrt -d -T 5000000 -P 16666666 -p 0 $mypid
>
> // Plug CPU3 back online
> > echo 1 > /sys/devices/system/cpu/cpu3/online
> [ 408.819337] Detected PIPT I-cache on CPU3
> [ 408.819642] GICv3: CPU3: found redistributor 3 region 0:0x0000000008100000
> [ 408.820165] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
>
> I don't see any easy way to fix this w/o iterating over all deadline
> tasks in the rd when hotplugging a CPU back on, and blocking the hotplug
> operation if it'll cause affinity issues. Urgh.
>

Yeah, this looks like a plain existing bug, joy. :)

We fixed a few bugs around admission control (AC) lately, but I guess the
work wasn't complete.
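
For reference, the hole in the sequence above can be modelled in a few lines
of plain C. This is only a toy sketch, not the kernel code: cpu_mask_t,
dl_affinity_ok() and replay_hotplug_hole() are made-up names, and the
assumption is simply what Quentin describes, i.e. that admission passes when
the task's affinity mask is a superset of the rd span, and that nothing
re-checks this when a CPU comes back online.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, one bit per CPU. Hypothetical type, not a kernel cpumask. */
typedef uint32_t cpu_mask_t;

/*
 * Assumed admission rule: the task must be allowed to run on every CPU
 * in the root domain span, i.e. span must be a subset of the affinity.
 */
static bool dl_affinity_ok(cpu_mask_t rd_span, cpu_mask_t task_affinity)
{
	return (rd_span & ~task_affinity) == 0;
}

/*
 * Replays the steps from the report on a 4-CPU system. Returns true if
 * the task ends up as DL while its affinity no longer covers the span.
 */
static bool replay_hotplug_hole(void)
{
	cpu_mask_t affinity = 0x7;	/* taskset -p 7: CPUs 0-2 */
	cpu_mask_t span = 0xf;		/* all four CPUs online */
	bool is_dl = false;

	/* First chrt attempt: span 0-3 is not covered, so it is rejected. */
	if (dl_affinity_ok(span, affinity))
		is_dl = true;
	assert(!is_dl);

	/* Offline CPU3: the rebuilt span shrinks to CPUs 0-2. */
	span &= ~(1u << 3);

	/* Second chrt attempt: span 0-2 is covered, so it is admitted. */
	if (dl_affinity_ok(span, affinity))
		is_dl = true;
	assert(is_dl);

	/* CPU3 comes back online; in this model nothing re-validates. */
	span |= 1u << 3;

	return is_dl && !dl_affinity_ok(span, affinity);
}
```

Calling replay_hotplug_hole() returns true here, which is the inconsistent
state the report ends up in: a DL task whose mask no longer covers the span.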

Thanks,
Juri