Re: [PATCH v5 0/7] Add latency priority for CFS class

From: K Prateek Nayak
Date: Tue Oct 25 2022 - 02:36:53 EST


Hello Vincent,

I've rerun some tests in a different configuration with more
contention for CPU, and I can see monotonic behavior. Sharing the
results below.

On 10/13/2022 8:54 PM, Vincent Guittot wrote:
>
> [..snip..]
>>
>> o Hackbench and Cyclictest in NPS1 configuration
>>
>> perf bench sched messaging -p -t -l 100000 -g 16&
>> cyclictest --policy other -D 5 -q -n -H 20000
>>
>> ----------------------------------------------------------------------------------------------------
>> | Hackbench |     Cyclictest LN = 19     |     Cyclictest LN = 0      |    Cyclictest LN = -20     |
>> | LN        |----------------------------|----------------------------|----------------------------|
>> | v         |    Min |    Avg |      Max |    Min |    Avg |      Max |    Min |    Avg |      Max |
>> |-----------|--------|--------|----------|--------|--------|----------|--------|--------|----------|
>> | 0         |  54.00 | 117.00 |  3021.67 |  53.67 |  65.33 |   133.00 |  53.67 |  65.00 |   201.33 | ^
>> | 19        |  50.00 | 100.67 |  3099.33 |  41.00 |  64.33 |  1014.33 |  54.00 |  63.67 |   213.33 |
>> | -20       |  53.00 | 169.00 | 11661.67 |  53.67 | 217.33 | 14313.67 |  46.00 |  61.33 |   236.00 | ^
>> ----------------------------------------------------------------------------------------------------
>
> The latency results look good with Cyclictest LN:0 and hackbench LN:0.
> 133us max latency. This suggests that your system is not overloaded
> and cyclictest doesn't really compete with others to run.

Following is the result of running cyclictest alongside hackbench with 32 groups:

perf bench sched messaging -p -l 100000 -g 32&
cyclictest --policy other -D 5 -q -n -H 20000

----------------------------------------------------------------------------------------------------
| Hackbench |     Cyclictest LN = 19     |     Cyclictest LN = 0      |    Cyclictest LN = -20     |
| LN        |----------------------------|----------------------------|----------------------------|
|           |    Min |    Avg |      Max |    Min |    Avg |      Max |    Min |    Avg |      Max |
|-----------|--------|--------|----------|--------|--------|----------|--------|--------|----------|
| 0         |  54.00 | 165.00 |  6899.00 |  22.00 |  85.00 |  3294.00 |  23.00 |  64.00 |   276.00 |
| 19        |  53.00 | 173.00 |  3275.00 |  40.00 |  60.00 |  2276.00 |  13.00 |  59.00 |    94.00 |
| -20       |  52.00 | 293.00 | 19980.00 |  52.00 | 280.00 | 14305.00 |  53.00 |  95.00 |  5713.00 |
----------------------------------------------------------------------------------------------------

I now see a spike in Max for the (0, 0) configuration, and the Avg and
Max latency decrease monotonically as the latency nice value of
cyclictest is lowered.

>
>>
>> o Hackbench and schbench in NPS1 configuration
>>
>> perf bench sched messaging -p -t -l 1000000 -g 16&
>> schbench -m 1 -t 64 -s 30s
>>
>> ----------------------------------------------------------------------------------------------------
>> | Hackbench |      schbench LN = 19      |      schbench LN = 0       |     schbench LN = -20      |
>> | LN        |----------------------------|----------------------------|----------------------------|
>> | v         |   90th |   95th |     99th |   90th |   95th |     99th |   90th |   95th |     99th |
>> |-----------|--------|--------|----------|--------|--------|----------|--------|--------|----------|
>> | 0         |   4264 |   6744 |    15664 |  17952 |  32672 |    55488 |  15088 |  25312 |    50112 |
>> | 19        |    288 |    613 |     2332 |    274 |   1015 |     3628 |    374 |   1394 |     4424 |
>> | -20       |  35904 |  47680 |    79744 |  87168 | 113536 |   176896 |  13008 |  21216 |    42560 | ^
>> ----------------------------------------------------------------------------------------------------
>
> For the schbench, your test is 30 seconds long which is longer than
> the duration of perf bench sched messaging -p -t -l 1000000 -g 16&
>
> The duration of the latter varies depending of latency nice value so
> schbench is disturb more time in some cases

I've rerun this with hackbench running 128 groups alongside schbench
with 2 messenger threads and 1 worker each. With a larger worker count,
I still see non-monotonic behavior in the 99th percentile latency of
schbench. I also see the number of latency samples collected by
schbench vary over the 30 second run for different latency nice
values, which could also play a part in the unexpected behavior. For
a lower worker count, the number of samples collected is similar.
Following is the configuration and the latency reported by schbench:

perf bench sched messaging -p -t -l 150000 -g 128&
schbench -m 2 -t 1 -s 30s

Note: In all cases, hackbench runs longer than schbench.

----------------------------------------------------------------------------------------------------
| Hackbench |      schbench LN = 19      |      schbench LN = 0       |     schbench LN = -20      |
| LN        |----------------------------|----------------------------|----------------------------|
|           |   90th |   95th |     99th |   90th |   95th |     99th |   90th |   95th |     99th |
|-----------|--------|--------|----------|--------|--------|----------|--------|--------|----------|
| 0         |     42 |     92 |     2972 |     26 |     49 |     2356 |      9 |     11 |       20 |
| 19        |     35 |    424 |     4984 |     13 |    390 |     5096 |      8 |     10 |       14 | ^
| -19       |    144 |   3516 |   110208 |     61 |    807 |    34880 |     25 |     39 |      295 |
----------------------------------------------------------------------------------------------------

I see the 90th and 95th percentile latency decrease monotonically with
the latency nice value of schbench (for a fixed latency nice value of
hackbench), but there are cases where the 99th percentile latency
reported by schbench does not strictly decrease with a lower latency
nice value (marked with ^).
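
The observation above can be checked mechanically against the table.
A minimal sketch, with the values copied from the schbench table above;
the dictionary layout and the helper name non_monotonic() are only for
illustration and are not part of any benchmark tooling:

```python
# schbench latencies from the table above, keyed by hackbench latency nice.
# Each list is ordered schbench LN = 19, 0, -20 (left to right in the table).
results = {
    0:   {"90th": [42, 26, 9],   "95th": [92, 49, 11],    "99th": [2972, 2356, 20]},
    19:  {"90th": [35, 13, 8],   "95th": [424, 390, 10],  "99th": [4984, 5096, 14]},
    -19: {"90th": [144, 61, 25], "95th": [3516, 807, 39], "99th": [110208, 34880, 295]},
}

def non_monotonic(table):
    """Return (hackbench LN, percentile) pairs whose latency does not
    strictly decrease as schbench's latency nice goes 19 -> 0 -> -20."""
    bad = []
    for hb_ln, row in table.items():
        for pct, vals in row.items():
            # Compare each value with its right-hand neighbour.
            if any(a <= b for a, b in zip(vals, vals[1:])):
                bad.append((hb_ln, pct))
    return bad

print(non_monotonic(results))  # -> [(19, '99th')]
```

Only the 99th percentile of the hackbench LN = 19 row is flagged, which
matches the row marked with ^.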

Note: Even a small number of bad samples can affect the 99th
percentile latency for the above configuration. The monotonic
behavior in the 90th and 95th percentile latency is a good data point
showing that latency nice is indeed working as expected.
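
To illustrate why the 99th percentile is fragile: with N samples, only
about N/100 of them sit at or above the 99th percentile, so a handful
of outliers is enough to move it while the 90th and 95th stay put. A
minimal sketch with synthetic numbers (not measured data; the sample
count and latency values are made up), using a plain nearest-rank
percentile rather than schbench's own histogram code:

```python
def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceil(pct * N / 100)
    return ordered[rank - 1]

# Synthetic data: 1000 wakeups at 10 usec, then the same run where
# 15 of those wakeups took 5000 usec instead.
clean = [10] * 1000
noisy = [10] * 985 + [5000] * 15

for name, data in (("clean", clean), ("noisy", noisy)):
    print(name, [percentile(data, p) for p in (90, 95, 99)])
# clean [10, 10, 10]
# noisy [10, 10, 5000]
```

Here 1.5% of the samples being slow leaves the 90th and 95th percentile
unchanged but moves the 99th by more than two orders of magnitude.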

If there is any specific workload you would like me to run on the
test system, or any additional data you would like for the above
workloads, please do let me know.

--
Thanks and Regards,
Prateek