Re: [tip:timers/core] [posix] 1535cb8028: stress-ng.epoll.ops_per_sec 36.2% regression

From: Thomas Gleixner
Date: Wed Mar 26 2025 - 17:43:54 EST


On Wed, Mar 26 2025 at 22:11, Mateusz Guzik wrote:
> On Wed, Mar 26, 2025 at 09:07:51AM +0100, Thomas Gleixner wrote:
>> How on earth can this commit result in both a 36% regression and a 25%
>> improvement with the same test?
>>
>> Unfortunately I can't reproduce any of it. I checked the epoll test
>> source and it uses a posix timer, but that commit makes the hash less
>> contended so there is zero explanation.
>>
>
> The short summary is:
> 1. your change is fine
> 2. stress-ng is doing seriously weird stuff here resulting in the above
> 3. there may or may not be something the scheduler can do to help
>
> for the regression stats are saying:
> feb864ee99a2d8a2 1535cb80286e6fbc834f075039f
> ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
>       5.97 ± 56%     +35.8       41.74 ± 24%  mpstat.cpu.all.idle%
>       0.86 ±  3%      -0.3        0.51 ± 11%  mpstat.cpu.all.irq%
>       0.10 ±  3%      +2.0        2.11 ± 13%  mpstat.cpu.all.soft%
>      92.01 ±  3%     -37.7       54.27 ± 18%  mpstat.cpu.all.sys%
>       1.06 ±  3%      +0.3        1.37 ±  8%  mpstat.cpu.all.usr%
>      27.83 ± 38%     -84.4%       4.33 ± 31%  mpstat.max_utilization.seconds
>
> As in system time went down and idle went up.
>
> Your patch must have a side effect where it messes with some of the
> timings between workers.

It does, as it removes the global lock and the potential contention on
it.

> The testcase is doing a lot of weird stuff, including calling yield()
> for every loop iteration. On top of that if the other worker does not
> win the race there is also a sleep of 0.1s thrown in. I commented these
> suckers out and weird anomalies persisted.
>
> All that said, I'm not going to further look into it. Was curious wtf
> though hence the write up.

Thank you for taking the time to look into this. The analysis of this
"benchmark" is a fun read, and it matches the impression I got from
looking at the source of this thing: it does weird stuff which does
not make any sense at all.

Thanks,

tglx