Re: [External] : Re: [REGRESSION] sched/fair: Add lag based placement

From: Joseph Salisbury
Date: Thu Nov 14 2024 - 14:51:46 EST

On 11/13/24 13:33, Joseph Salisbury wrote:



On 11/13/24 13:22, Joseph Salisbury wrote:



On 11/13/24 13:19, Phil Auld wrote:
Hi,

On Wed, Nov 13, 2024 at 01:03:00PM -0500 Joseph Salisbury wrote:
Hello,

During performance testing, we found a ~9% performance regression with the
TPCC benchmark.  The regression was introduced in v6.6-rc1.  A bisect
identified the following commit as the cause:

86bfbb7ce4f6 ("sched/fair: Add lag based placement")

I was hoping to get some feedback from the scheduler folks. Do you think
gathering any additional data will help diagnose this issue? Are there any
tunable options that can be changed to see how performance is affected?

You can try turning off the PLACE_LAG sched feature:

     echo NO_PLACE_LAG > /sys/kernel/debug/sched/features

It's not what I'd call a tunable but it would allow you to test w/o it and
see what it does.  It should allow you to switch back and forth easily for
testing.
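
A minimal sketch of how the back-and-forth switching could be scripted,
assuming debugfs is mounted at /sys/kernel/debug and the commands run as
root.  The helper names sched_feat_state and sched_feat_set and the
FEATURES_FILE override are hypothetical, not part of any kernel tooling.
It relies on the features file listing all flags on one line, with a NO_
prefix on the disabled ones.

```shell
#!/bin/sh
# Sketch only: the helper names and the FEATURES_FILE override are
# illustrative assumptions, not existing kernel or distro tooling.
FEATURES_FILE="${FEATURES_FILE:-/sys/kernel/debug/sched/features}"

# Print "on", "off", or "unknown" for a feature name such as PLACE_LAG.
# Disabled features appear in the file with a NO_ prefix.
sched_feat_state() {
    feat="$1"
    for word in $(cat "$FEATURES_FILE"); do
        case "$word" in
            "$feat")    echo on;  return 0 ;;
            "NO_$feat") echo off; return 0 ;;
        esac
    done
    echo unknown
    return 1
}

# Enable or disable a feature by writing its name (or NO_<name>) to the file.
# Usage: sched_feat_set PLACE_LAG off
sched_feat_set() {
    case "$2" in
        on)  echo "$1"    > "$FEATURES_FILE" ;;
        off) echo "NO_$1" > "$FEATURES_FILE" ;;
        *)   echo "usage: sched_feat_set FEATURE on|off" >&2; return 2 ;;
    esac
}
```

For an A/B comparison one could run "sched_feat_set PLACE_LAG off", check
it with "sched_feat_state PLACE_LAG", run the benchmark, and then restore
the default with "sched_feat_set PLACE_LAG on" before the next run.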


Cheers,
Phil
Thanks so much for the suggestion, Phil!  I will give that a try and report
the results.

We can confirm that using NO_PLACE_LAG adds back 5% of the performance that
was lost.  However, we have not yet measured what effect this will have on
other benchmarks.

We will continue testing and can help test the patches that add PLACE_LAG and RUN_TO_PARITY as sysctl options.

Thanks,

Joe

I just noticed this thread, which is probably related:
https://lore.kernel.org/lkml/ZxuujhhrJcoYOdMJ@xxxxxxxxxxxxxxxxxxxx/T/