Re: [External] : Re: [REGRESSION][v6.17-rc1]sched/fair: Bump sd->max_newidle_lb_cost when newidle balance fails
From: Joseph Salisbury
Date: Tue Oct 07 2025 - 16:23:25 EST
On 10/6/25 17:23, Chris Mason wrote:
On 10/6/25 4:23 PM, Joseph Salisbury wrote:
Hi Chris,
Hi everyone,
During testing, we are seeing a ~6% performance regression with the
upstream stable v6.12.43 kernel (and Oracle UEK
6.12.0-104.43.4.el9uek.x86_64 kernel) when running the Phoronix
pts/apache benchmark with 100 concurrent requests [0]. The regression
is seen with the following hardware:
PROCESSOR: Intel Xeon Platinum 8167M
  Core Count: 8
  Thread Count: 16
  Extensions: SSE 4.2 + AVX512CD + AVX2 + AVX + RDRAND + FSGSBASE
  Cache Size: 16 MB
  Microcode: 0x1
  Core Family: Cascade Lake
After performing a bisect, we found that the performance regression was
introduced by the following commit:
Stable v6.12.43: fc4289233e4b ("sched/fair: Bump sd->max_newidle_lb_cost
when newidle balance fails")
Mainline v6.17-rc1: 155213a2aed4 ("sched/fair: Bump
sd->max_newidle_lb_cost when newidle balance fails")
Reverting this commit eliminates the performance regression.
I was hoping to get your feedback, since you are the patch author. Do
you think gathering any additional data will help diagnose this issue?
Peter, we've had a collection of regression reports based on this
change, so it sounds like we need to make it less aggressive, or maybe
we need to make the degrading of the cost number more aggressive?
Joe (and everyone else who has hit this), can I talk you into trying the
drgn from
https://lore.kernel.org/lkml/2fbf24bc-e895-40de-9ff6-5c18b74b4300@xxxxxxxx/
I'm curious if it degrades at all or just gets stuck up high.
-chris
Thanks for the quick response!
Yes, I will try out the drgn from the link you posted and provide feedback.
Thanks,
Joe