Re: [RFC 1/4] sched/core: Introduce per_cpu counter to track latency sensitive tasks

From: Parth Shah
Date: Fri May 08 2020 - 07:31:10 EST

On 5/8/20 2:10 PM, Pavan Kondeti wrote:
> On Thu, May 07, 2020 at 07:07:20PM +0530, Parth Shah wrote:
>> The "nr_lat_sensitive" per_cpu variable provides a hint on the possible
>> number of latency-sensitive tasks occupying the CPU. This hint further
>> helps in inhibiting the CPUIDLE governor from choosing deeper IDLE states
>> (later patches include this).
>>
>
> Can you please explain the intended use case here? Once a latency sensitive
> task is created, it prevents c-state on a CPU whether the task runs again
> or not in the near future.
>
> I assume, either these latency sensitive tasks won't be around for long time
> or applications set/reset latency sensitive nice value dynamically.
>

The intended use case is to avoid the exit_latency of IDLE states for
workloads with a wakeup-sleep-wakeup pattern. These types of tasks (like
GPU workloads and a few DB benchmarks) make the CPU go IDLE because of
their low runtime on the rq, resulting in higher wakeup latency due to the
IDLE states' exit_latency.

And this kind of workload may last for a long time as well.

In the current scenario, sysadmins disable all IDLE states or use PM_QoS to
avoid the latency penalty on the workload. This model was good when core
counts were low. But now, with higher core counts and Turbo frequencies,
saving power is needed in order to get higher performance, and hence this
patch-set tries to do a PM_QoS-like thing but at per-task granularity.
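
For reference, a minimal sketch of how the existing PM_QoS route is usually
taken from userspace: a 32-bit latency bound (in microseconds) is written
to /dev/cpu_dma_latency, and the constraint holds only while the file
descriptor stays open. The helper name pm_qos_request_latency() is made up
for this illustration; the path is passed in so the sketch can be tried
against any writable file.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: place a PM_QoS CPU latency request by writing a
 * 32-bit bound (microseconds) to the given device node, normally
 * /dev/cpu_dma_latency. Returns the open fd on success, -1 on failure.
 * The request is dropped as soon as the caller closes the fd.
 */
static int pm_qos_request_latency(const char *path, int32_t max_latency_us)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;

	if (write(fd, &max_latency_us, sizeof(max_latency_us)) !=
	    (ssize_t)sizeof(max_latency_us)) {
		close(fd);
		return -1;
	}

	return fd;
}
```

Note that such a request applies to every CPU in the system, whichever
task issued it, which is the granularity problem described above.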

If the idea seems good to go, then this can potentially be extended to do
IDLE gating up to a certain level, where the latency_nice value hints at
which IDLE states can't be chosen, just like PM_QoS has cpu_dma_latency
constraints.
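
A rough userspace sketch of what such gating could look like, not taken
from the patches: a latency_nice value is mapped linearly to an
exit-latency budget, and the deepest IDLE state that fits the budget is
picked. The [-20, 19] range, the linear mapping, and all names
(struct idle_state, latency_budget_us(), deepest_allowed_state()) are
assumptions made for this illustration only.

```c
#include <stddef.h>

/* Hypothetical description of one IDLE state. */
struct idle_state {
	const char *name;
	unsigned int exit_latency_us;
};

/*
 * Assumed mapping: latency_nice -20 gives a 0us budget (shallowest state
 * only), 19 gives the full max_budget_us, linear in between.
 */
static unsigned int latency_budget_us(int latency_nice,
				      unsigned int max_budget_us)
{
	if (latency_nice < -20)
		latency_nice = -20;
	if (latency_nice > 19)
		latency_nice = 19;

	return (unsigned int)(latency_nice + 20) * max_budget_us / 39;
}

/*
 * States are expected to be sorted from shallowest to deepest. State 0 is
 * always allowed; deeper states are allowed only while their exit latency
 * fits within the budget.
 */
static size_t deepest_allowed_state(const struct idle_state *states,
				    size_t nr_states, unsigned int budget_us)
{
	size_t best = 0;

	for (size_t i = 1; i < nr_states; i++) {
		if (states[i].exit_latency_us > budget_us)
			break;
		best = i;
	}

	return best;
}
```

With states { poll: 0us, C1: 2us, C6: 100us } and a maximum budget of
100us, a task at latency_nice -20 would keep the CPU in poll, while a task
at 19 would still allow C6.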


Thanks,
Parth


> Thanks,
> Pavan
>