Re: [PATCH v3 15/21] sched/cache: Disable cache aware scheduling for processes with high thread counts


From: Tim Chen

Date: Wed Feb 18 2026 - 16:44:45 EST

On Wed, 2026-02-18 at 23:24 +0530, Madadi Vineeth Reddy wrote:
> On 11/02/26 03:48, Tim Chen wrote:
> > From: Chen Yu <yu.c.chen@xxxxxxxxx>
> >
> >
[ .. snip ..]

> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index d1145997b88d..86b6b08e7e1e 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -1223,6 +1223,19 @@ static inline bool valid_llc_buf(struct sched_domain *sd,
> >  	return valid_llc_id(id);
> >  }
> >
> > +static bool exceed_llc_nr(struct mm_struct *mm, int cpu)
> > +{
> > +	int smt_nr = 1;
> > +
> > +#ifdef CONFIG_SCHED_SMT
> > +	if (sched_smt_active())
> > +		smt_nr = cpumask_weight(cpu_smt_mask(cpu));
> > +#endif
> > +
> > +	return !fits_capacity((mm->sc_stat.nr_running_avg * smt_nr),
> > +			per_cpu(sd_llc_size, cpu));
>
>
> On Power10/Power11 with SMT4 and LLC size of 4, this check
> effectively disables cache-aware scheduling for any process.

There are 4 cores per LLC, with 4 SMT threads per core? In that case, once
we have more than 4 running threads and there's another idle LLC available,
it seems like putting the additional thread on a different LLC is the
right thing to do, as threads sharing a core will usually be much
slower.

But when the number of threads is under 4, we should still be
doing aggregation.

Perhaps I am misunderstanding your topology.
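
For reference, the arithmetic behind the two readings of the topology can
be sketched in userspace C. This is only a model: it assumes
fits_capacity() keeps its mainline definition (cap * 1280 < max * 1024,
i.e. roughly an 80% margin) and that sd_llc_size counts logical CPUs in
the LLC. The *_model names and the plain integer parameters are made up
for illustration and are not part of the patch.

```c
#include <stdbool.h>

/*
 * Model of the kernel's fits_capacity() margin (assumed definition):
 * cap fits only while it stays below roughly 80% of max.
 */
static bool fits_capacity_model(unsigned long cap, unsigned long max)
{
	return cap * 1280 < max * 1024;
}

/*
 * Userspace model of exceed_llc_nr(). The parameters are hypothetical
 * stand-ins for mm->sc_stat.nr_running_avg, the SMT sibling count, and
 * per_cpu(sd_llc_size, cpu).
 */
static bool exceed_llc_nr_model(unsigned long nr_running_avg,
				unsigned long smt_nr,
				unsigned long llc_size)
{
	return !fits_capacity_model(nr_running_avg * smt_nr, llc_size);
}
```

If sd_llc_size is 4 logical CPUs (a single SMT4 core per LLC),
exceed_llc_nr_model(1, 4, 4) is already true, so one running thread is
enough to turn aggregation off. If the LLC instead spans 4 SMT4 cores
(16 logical CPUs), exceed_llc_nr_model(3, 4, 16) is false and
exceed_llc_nr_model(4, 4, 16) is true, which would match the "under 4
threads" case above.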

Tim

>
> I raised this point in v1 as well. Increasing the threshold
> doesn't seem like a viable solution either, as that would regress
> hackbench/ebizzy.
>
> Is there a way to make this useful for architectures with small LLC
> sizes? One possible approach we were exploring is to have LLC at a
> hemisphere level that comprises multiple SMT4 cores.
>
> Thanks,
> Vineeth