Re: [PATCH v4 2/9] sched/topology: Extract "imb_numa_nr" calculation into a separate helper
From: Dietmar Eggemann
Date: Sun Mar 15 2026 - 20:19:14 EST
On 12.03.26 05:44, K Prateek Nayak wrote:
[...]
> +/*
> + * Calculate an allowed NUMA imbalance such that LLCs do not get
> + * imbalanced.
> + */
> +static void adjust_numa_imbalance(struct sched_domain *sd_llc)
> +{
> + struct sched_domain *parent;
> + unsigned int imb_span = 1;
> + unsigned int imb = 0;
> + unsigned int nr_llcs;
> +
> + WARN_ON(!(sd_llc->flags & SD_SHARE_LLC));
> + WARN_ON(!sd_llc->parent);
> +
> + /*
> + * For a single LLC per node, allow an
> + * imbalance up to 12.5% of the node. This is
> + * arbitrary cutoff based two factors -- SMT and
> + * memory channels. For SMT-2, the intent is to
> + * avoid premature sharing of HT resources but
> + * SMT-4 or SMT-8 *may* benefit from a different
> + * cutoff. For memory channels, this is a very
> + * rough estimate of how many channels may be
> + * active and is based on recent CPUs with
> + * many cores.
> + *
> + * For multiple LLCs, allow an imbalance
> + * until multiple tasks would share an LLC
> + * on one node while LLCs on another node
> + * remain idle. This assumes that there are
> + * enough logical CPUs per LLC to avoid SMT
> + * factors and that there is a correlation
> + * between LLCs and memory channels.
> + */
> + nr_llcs = sd_llc->parent->span_weight / sd_llc->span_weight;
> + if (nr_llcs == 1)
> + imb = sd_llc->parent->span_weight >> 3;
> + else
> + imb = nr_llcs;
> +
> + imb = max(1U, imb);
> + sd_llc->parent->imb_numa_nr = imb;
Here you set imb_numa_nr e.g. for PKG ...
> +
> + /*
> + * Set span based on the first NUMA domain.
> + *
> + * NUMA systems always add a NODE domain before
> + * iterating the NUMA domains. Since this is before
> + * degeneration, start from sd_llc's parent's
> + * parent which is the lowest an SD_NUMA domain can
> + * be relative to sd_llc.
> + */
> + parent = sd_llc->parent->parent;
> + while (parent && !(parent->flags & SD_NUMA))
> + parent = parent->parent;
> +
> + imb_span = parent ? parent->span_weight : sd_llc->parent->span_weight;
> +
> + /* Update the upper remainder of the topology */
> + parent = sd_llc->parent;
> + while (parent) {
> + int factor = max(1U, (parent->span_weight / imb_span));
> +
> + parent->imb_numa_nr = imb * factor;
... and here again.
Shouldn't we only set it for 'if (parent->flags & SD_NUMA)'?
Not sure if there are cases in which PKG would persist in
... -> MC -> PKG -> NODE -> NUMA -> ... ?
That said, access to sd->imb_numa_nr already seems to be guarded by
sd->flags & SD_NUMA.
> + parent = parent->parent;
> + }
> +}
> +
[...]
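
To make the question above concrete, here is a rough userspace sketch of the
arithmetic in the quoted helper. It is not kernel code: struct toy_sd, the
TOY_SD_* flag values, toy_adjust_numa_imbalance(), toy_run() and the
numa_only switch are all hypothetical names made up for illustration, and the
MC(8) -> PKG(64) -> NODE(64) -> NUMA(128) -> NUMA(256) topology is an assumed
example, not taken from a real machine. With numa_only set, both writes are
skipped for domains without the NUMA flag, which is what the
'if (parent->flags & SD_NUMA)' suggestion would amount to.

```c
#include <assert.h>
#include <stddef.h>

/* Toy flag values for this sketch only; not the kernel's real values. */
#define TOY_SD_SHARE_LLC 0x1u
#define TOY_SD_NUMA      0x2u

/* Hypothetical stand-in for struct sched_domain, only the fields used here. */
struct toy_sd {
	unsigned int span_weight;
	unsigned int flags;
	unsigned int imb_numa_nr;
	struct toy_sd *parent;
};

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Same arithmetic as the quoted helper. If numa_only is non-zero,
 * imb_numa_nr is written only for domains that carry TOY_SD_NUMA.
 */
static void toy_adjust_numa_imbalance(struct toy_sd *llc, int numa_only)
{
	struct toy_sd *parent;
	unsigned int imb_span, imb, nr_llcs;

	assert(llc->flags & TOY_SD_SHARE_LLC);
	assert(llc->parent);

	nr_llcs = llc->parent->span_weight / llc->span_weight;
	imb = (nr_llcs == 1) ? llc->parent->span_weight >> 3 : nr_llcs;
	imb = max_u(1U, imb);

	/* First write: hits the LLC's direct parent (PKG in the example). */
	if (!numa_only || (llc->parent->flags & TOY_SD_NUMA))
		llc->parent->imb_numa_nr = imb;

	/* The span of the first NUMA domain above the LLC's parent. */
	parent = llc->parent->parent;
	while (parent && !(parent->flags & TOY_SD_NUMA))
		parent = parent->parent;
	imb_span = parent ? parent->span_weight : llc->parent->span_weight;

	/* Second write: walks every ancestor, starting at the same parent. */
	for (parent = llc->parent; parent; parent = parent->parent) {
		unsigned int factor = max_u(1U, parent->span_weight / imb_span);

		if (numa_only && !(parent->flags & TOY_SD_NUMA))
			continue;
		parent->imb_numa_nr = imb * factor;
	}
}

/* Build the assumed topology, run the helper, report PKG/NODE/NUMA/NUMA2. */
static void toy_run(int numa_only, unsigned int out[4])
{
	struct toy_sd numa2 = { .span_weight = 256, .flags = TOY_SD_NUMA };
	struct toy_sd numa  = { .span_weight = 128, .flags = TOY_SD_NUMA, .parent = &numa2 };
	struct toy_sd node  = { .span_weight = 64, .parent = &numa };
	struct toy_sd pkg   = { .span_weight = 64, .parent = &node };
	struct toy_sd mc    = { .span_weight = 8, .flags = TOY_SD_SHARE_LLC, .parent = &pkg };

	toy_adjust_numa_imbalance(&mc, numa_only);

	out[0] = pkg.imb_numa_nr;
	out[1] = node.imb_numa_nr;
	out[2] = numa.imb_numa_nr;
	out[3] = numa2.imb_numa_nr;
}
```

Calling toy_run(0, out) gives 8, 8, 8, 16 for PKG, NODE, NUMA and NUMA2, so
PKG is written twice with the same value and NODE picks one up as well.
toy_run(1, out) gives 0, 0, 8, 16: the NUMA levels end up identical and the
non-NUMA levels are left untouched. Under these assumptions the guard would
only drop writes that nobody reads.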