Re: Usecases for the per-task latency-nice attribute
From: Tim Chen
Date: Wed Sep 18 2019 - 13:16:32 EST
On 9/18/19 5:41 AM, Parth Shah wrote:
> Hello everyone,
>
> As per the discussion at LPC2019, a new per-task property like latency-nice
> can be useful in certain scenarios. The scheduler can make better decisions
> by knowing the latency requirements of a task from the end-user itself.
>
> There has already been an effort from Subhra to introduce task latency-nice
> [1] values, and we have seen several possibilities where this type of
> interface can be used.
>
> To the best of my understanding of the discussion on the mail thread and
> at LPC2019, it seems that there are two dilemmas;
Thanks for starting the discussion.
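For reference, a minimal userspace sketch of how such a per-task value might
be set, assuming the RFC's hypothetical sched_latency_nice field in struct
sched_attr and a hypothetical SCHED_FLAG_LATENCY_NICE flag (neither is an
existing ABI; this only works against a kernel carrying such a patch):

        /*
         * Hypothetical sketch: set a per-task latency-nice value through an
         * extended sched_setattr().  The sched_latency_nice field and the
         * SCHED_FLAG_LATENCY_NICE flag are assumptions based on the RFC
         * under discussion, not an existing kernel interface.
         */
        #include <stdint.h>
        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>
        #include <sys/syscall.h>

        #define SCHED_FLAG_LATENCY_NICE 0x80    /* assumed flag value */

        struct sched_attr {
                uint32_t size;
                uint32_t sched_policy;
                uint64_t sched_flags;
                int32_t  sched_nice;
                uint32_t sched_priority;
                uint64_t sched_runtime;
                uint64_t sched_deadline;
                uint64_t sched_period;
                int32_t  sched_latency_nice;    /* assumed new field */
        };

        int main(void)
        {
                struct sched_attr attr;

                memset(&attr, 0, sizeof(attr));
                attr.size = sizeof(attr);
                attr.sched_flags = SCHED_FLAG_LATENCY_NICE;
                attr.sched_latency_nice = 19;   /* latency tolerant */

                if (syscall(SYS_sched_setattr, 0, &attr, 0))
                        perror("sched_setattr");
                return 0;
        }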
>
> -------------------
> **Usecases**
> -------------------
>
> $> TurboSched
> ====================
> TurboSched [2] tries to minimize the number of active cores in a socket by
> packing unimportant, low-utilization (so-called jitter) tasks on already
> active cores, thus refraining from waking up a new core if possible. This
> requires tagging of tasks from userspace, hinting at which tasks are
> unimportant and for which waking up a new core to minimize latency is
> unnecessary.
> As per the discussion on the posted RFC, it would be appropriate to use the
> task latency property, where a task with the highest latency-nice value can
> be packed.
> For this specific use-case, just a binary value indicating which tasks are
> latency-sensitive and which are not would be sufficient, but having a range
> is also a good way to go, where a task above some threshold can be packed.
>
>
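A minimal sketch of the packing decision described above, assuming a per-task
latency_nice value in the -20..19 range and an illustrative threshold (the
names below are assumptions for illustration, not actual TurboSched code):

        /*
         * Tasks whose latency-nice value is at or above a threshold may be
         * packed onto an already-active core instead of waking an idle one.
         */
        #define LATENCY_NICE_PACK_THRESHOLD     15

        struct task {
                int latency_nice;       /* -20 (sensitive) .. 19 (tolerant) */
        };

        /*
         * Return 1 if the task may be packed onto a busy core, 0 if it
         * should get an idle core to minimize wakeup latency.
         */
        static int task_can_be_packed(const struct task *p)
        {
                return p->latency_nice >= LATENCY_NICE_PACK_THRESHOLD;
        }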
$> Separating AVX512 tasks and latency-sensitive tasks onto separate cores
==========================================================================
Another usecase we are considering is to segregate workloads that will pull
down the core CPU frequency (e.g. AVX512) from workloads that are latency
sensitive. Certain tasks need to provide a fast response time (latency
sensitive), and they are best scheduled on a CPU that has a lighter load and
does not have tasks running on the sibling CPU that could pull down the CPU
core frequency. Some users run machine learning batch tasks with AVX512 and
have observed that these tasks affect the tasks needing a fast response.
They have to rely on manual CPU affinity to separate the two kinds of tasks.
With an appropriate latency hint on tasks, the scheduler could be taught to
separate them automatically.
Tim