Re: [PATCH] sched/deadline: Unthrottle PI boosted threads while enqueuing
From: Juri Lelli
Date: Fri Sep 18 2020 - 02:00:37 EST
Hi Daniel,
On 16/09/20 09:06, Daniel Bristot de Oliveira wrote:
> stress-ng has a test (stress-ng --cyclic) that creates a set of threads
> under SCHED_DEADLINE with the following parameters:
>
> dl_runtime = 10000 (10 us)
> dl_deadline = 100000 (100 us)
> dl_period = 100000 (100 us)
>
> These parameters are very aggressive. When using a system without HRTICK
> set, these threads can easily execute longer than the dl_runtime because
> the throttling happens with 1/HZ resolution.
>
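To put a number on the 1/HZ resolution mentioned above, here is a rough user-space sketch. It assumes a simplified model in which the overrun of a budget shorter than one tick is only noticed at the next tick, and the HZ values are assumptions, since the report does not state the kernel configuration. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Length of one scheduler tick in nanoseconds for a given HZ. */
static uint64_t tick_ns(unsigned int hz)
{
	assert(hz > 0);
	return NSEC_PER_SEC / hz;
}

/*
 * Simplified model (an assumption, not kernel code): a task whose budget
 * is shorter than a tick starts right after a tick and is only checked
 * at the next one, so it runs for a whole tick before being throttled.
 * Returns how far past dl_runtime it got, or 0 if the budget covers a tick.
 */
static uint64_t worst_case_overrun_ns(uint64_t dl_runtime_ns, unsigned int hz)
{
	uint64_t tick = tick_ns(hz);

	return tick > dl_runtime_ns ? tick - dl_runtime_ns : 0;
}
```

Under this model, with HZ=1000 a 10 us budget can be exceeded by roughly 990 us, which is about ten full periods of the stress-ng threads.
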
> During the main part of the test, the system works just fine because
> the workload does not try to run over the 10 us. The problem happens at
> the end of the test, on the exit() path. During exit(), the threads need
> to do some cleanups that require real-time mutex locks, mainly those
> related to memory management, resulting in this scenario:
>
> Note: locks are rt_mutexes...
> ------------------------------------------------------------------------
>    TASK A:               TASK B:                         TASK C:
>    activation
>                                                          activation
>                          activation
>
>    lock(a): OK!          lock(b): OK!
>                          <overrun runtime>
>                          lock(a)
>                          -> block (task A owns it)
>                            -> self notice/set throttled
> +--<                       -> arm replenishment timer
> |                        switch-out
> |                                                        lock(b)
> |                                                        -> <C prio > B prio>
> |                                                        -> boost TASK B
> |  unlock(a)                                             switch-out
> |  -> hand lock a to B
> |    -> wakeup(B)
> |      -> B is throttled:
> |        -> do not enqueue
> |     switch-out
> |
> |
> +---------------------> replenishment timer
>                         -> TASK B is boosted:
>                           -> do not enqueue
> ------------------------------------------------------------------------
>
> BOOM: TASK B is runnable but !enqueued, holding TASK C: the system
> crashes with hung task C.
>
> This problem is avoided by removing the throttle state from the boosted
> thread while boosting it (by TASK A in the example above), allowing it to
> be queued and run boosted.
>
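For readers following along, the failure and the fix can be replayed with a toy user-space model. This is only a sketch: the struct, the field names and the functions are hypothetical and are not the kernel's data structures or the code in the patch. It models just two rules taken from the description above: a throttled task is not enqueued on wakeup, and the replenishment timer skips a boosted task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified per-task state; not the kernel's struct. */
struct toy_dl_task {
	bool throttled;	/* budget exhausted, waiting for replenishment */
	bool boosted;	/* priority inherited through an rt_mutex */
	bool enqueued;	/* on the runqueue, eligible to run */
};

/* Wakeup path. unthrottle_boosted == false models the old behaviour. */
static void toy_enqueue(struct toy_dl_task *t, bool unthrottle_boosted)
{
	assert(t != NULL);
	if (unthrottle_boosted && t->boosted && t->throttled)
		t->throttled = false;	/* the boost overrides the throttle */
	if (!t->throttled)
		t->enqueued = true;
}

/* Replenishment timer: boosted tasks are skipped, as in the diagram. */
static void toy_replenish_timer(struct toy_dl_task *t)
{
	assert(t != NULL);
	if (t->boosted)
		return;
	t->throttled = false;
	t->enqueued = true;
}

/* Replays the diagram for TASK B; returns whether B ends up enqueued. */
static bool toy_replay(bool unthrottle_boosted)
{
	struct toy_dl_task b = { .throttled = true, .boosted = true };

	toy_enqueue(&b, unthrottle_boosted);	/* wakeup(B) from unlock(a) */
	toy_replenish_timer(&b);		/* timer armed at throttle time */
	return b.enqueued;
}
```

With `toy_replay(false)` TASK B is left runnable but never enqueued, which is the hang in the report. With `toy_replay(true)` it is enqueued at wakeup and can release lock(b) for TASK C.
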
> The next replenishment will take care of the runtime overrun, pushing
> the deadline further away. See the "while (dl_se->runtime <= 0)" loop in
> replenish_dl_entity() for more information.
>
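A minimal sketch of how that loop pays back an overrun, for illustration only. The loop condition is the one quoted above. The struct, the function name and the return value are hypothetical, and the real replenish_dl_entity() does more than this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced view of a deadline entity; all values in ns. */
struct toy_dl_entity {
	int64_t runtime;	/* remaining budget, negative after an overrun */
	uint64_t deadline;	/* absolute deadline */
};

/*
 * Each pass adds one budget and pushes the deadline one period further,
 * until the overrun has been paid back. Returns the number of periods
 * the deadline was postponed by.
 */
static unsigned int toy_replenish(struct toy_dl_entity *se,
				  int64_t dl_runtime, uint64_t dl_period)
{
	unsigned int periods = 0;

	assert(dl_runtime > 0);	/* otherwise the loop could not terminate */
	while (se->runtime <= 0) {
		se->deadline += dl_period;
		se->runtime += dl_runtime;
		periods++;
	}
	return periods;
}
```

With the stress-ng parameters (10 us runtime, 100 us period), a task that went 25 us over budget gets its deadline pushed three periods out and ends up with 5 us of budget left.
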
> Signed-off-by: Daniel Bristot de Oliveira <bristot@xxxxxxxxxx>
> Reported-by: Mark Simmons <msimmons@xxxxxxxxxx>
> Reviewed-by: Juri Lelli <juri.lelli@xxxxxxxxxx>
> Tested-by: Mark Simmons <msimmons@xxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
> Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Cc: Ben Segall <bsegall@xxxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxx>
> Cc: Daniel Bristot de Oliveira <bristot@xxxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx
>
> ---
Thanks for this fix.
Acked-by: Juri Lelli <juri.lelli@xxxxxxxxxx>
Best,
Juri