Re: [RFC PATCH v3 2/3] sched: Introduce cpus_share_l2c

From: Mathieu Desnoyers
Date: Fri Aug 25 2023 - 09:51:05 EST

Next message: Andrew Lunn: "Re: [RFC net-next v2 3/5] net: phy: nxp-c45-tja11xx add MACsec support"
Previous message: Kirill A. Shutemov: "Re: [PATCH] x86/tdx: Mark TSC reliable"
In reply to: Aaron Lu: "Re: [RFC PATCH v3 2/3] sched: Introduce cpus_share_l2c"
Next in thread: Aaron Lu: "Re: [RFC PATCH v3 2/3] sched: Introduce cpus_share_l2c"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 8/25/23 02:49, Aaron Lu wrote:

On Thu, Aug 24, 2023 at 10:40:45AM -0400, Mathieu Desnoyers wrote:

[...]

- task migrations dropped with this series for nr_group=20 and 32
according to 'perf stat'. migration number didn't drop for nr_group=10
but the two update functions' cost dropped which means fewer access to
tg->load_avg and thus, fewer task migrations. This is contradictory
and I can not explain yet;

Neither can I.

[...]

It's not clear to me why this series can reduce task migrations. I doubt
it has something to do with more wakelist style wakeup becasue for this
test machine, only a single core with two SMT threads share L2 so more
wakeups are through wakelist. In wakelist style wakeup, the target rq's
ttwu_pending is set and that will make the target cpu as !idle_cpu();
This is faster than grabbing the target rq's lock and then increase
target rq's nr_running or set target rq's curr to something else than
idle. So wakelist style wakeup can make target cpu appear as non idle
faster, but I can't connect this with reduced migration yet, I just feel
this might be the reason why task migration reduced.

[...]

I've tried adding checks for rq->ttwu_pending in those code paths on top of
my patch and I'm still observing the reduction in number of migrations, so
it's unclear to me how doing more queued wakeups can reduce migrations the
way it does.

An interesting puzzle.

One metric that can help understand the impact of my patch: comparing
hackbench from a baseline where only your load_avg patch is applied
to a kernel with my l2c patch applied, I notice that the goidle
schedstat is cut in half. For a given CPU (they are pretty much alike),
it goes from 650456 to 353487.

So could it be that by doing queued wakeups, we end up batching
execution of the woken up tasks for a given CPU, rather than going
back and forth between idle and non-idle ? One important thing that
this changes is to reduce the number of newidle balance triggered.

Thoughts ?

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Next message: Andrew Lunn: "Re: [RFC net-next v2 3/5] net: phy: nxp-c45-tja11xx add MACsec support"
Previous message: Kirill A. Shutemov: "Re: [PATCH] x86/tdx: Mark TSC reliable"
In reply to: Aaron Lu: "Re: [RFC PATCH v3 2/3] sched: Introduce cpus_share_l2c"
Next in thread: Aaron Lu: "Re: [RFC PATCH v3 2/3] sched: Introduce cpus_share_l2c"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]