Re: [PATCH 02/10] cacheinfo: calculate per-CPU data cache size

From: Mel Gorman
Date: Thu Oct 12 2023 - 08:53:05 EST


On Thu, Oct 12, 2023 at 08:08:32PM +0800, Huang, Ying wrote:
> Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> writes:
>
> > On Wed, Sep 20, 2023 at 02:18:48PM +0800, Huang Ying wrote:
> >> Per-CPU data cache size is useful information. For example, it can be
> >> used to estimate the per-CPU share of the data cache. So, in this
> >> patch, the data cache size for each CPU is calculated via
> >> data_cache_size / shared_cpu_weight.
> >>
> >> A brute-force algorithm that iterates all online CPUs is used to avoid
> >> allocating an extra cpumask, especially in the offline callback.
> >>
> >> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
> >
> > It's not necessarily relevant to the patch, but at least the scheduler
> > also stores some per-cpu topology information such as sd_llc_size -- the
> > number of CPUs sharing the same last-level-cache as this CPU. It may be
> > worth unifying this at some point if it's common that per-cpu
> > information is too fine and per-zone or per-node information is too
> > coarse. This would be particularly true when considering locking
> > granularity.
> >
> >> Cc: Sudeep Holla <sudeep.holla@xxxxxxx>
> >> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> >> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> >> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> >> Cc: David Hildenbrand <david@xxxxxxxxxx>
> >> Cc: Johannes Weiner <jweiner@xxxxxxxxxx>
> >> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> >> Cc: Michal Hocko <mhocko@xxxxxxxx>
> >> Cc: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
> >> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> >> Cc: Christoph Lameter <cl@xxxxxxxxx>
> >> ---
> >> drivers/base/cacheinfo.c | 42 ++++++++++++++++++++++++++++++++++++++-
> >> include/linux/cacheinfo.h | 1 +
> >> 2 files changed, 42 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> >> index cbae8be1fe52..3e8951a3fbab 100644
> >> --- a/drivers/base/cacheinfo.c
> >> +++ b/drivers/base/cacheinfo.c
> >> @@ -898,6 +898,41 @@ static int cache_add_dev(unsigned int cpu)
> >> return rc;
> >> }
> >>
> >> +static void update_data_cache_size_cpu(unsigned int cpu)
> >> +{
> >> + struct cpu_cacheinfo *ci;
> >> + struct cacheinfo *leaf;
> >> + unsigned int i, nr_shared;
> >> + unsigned int size_data = 0;
> >> +
> >> + if (!per_cpu_cacheinfo(cpu))
> >> + return;
> >> +
> >> + ci = ci_cacheinfo(cpu);
> >> + for (i = 0; i < cache_leaves(cpu); i++) {
> >> + leaf = per_cpu_cacheinfo_idx(cpu, i);
> >> + if (leaf->type != CACHE_TYPE_DATA &&
> >> + leaf->type != CACHE_TYPE_UNIFIED)
> >> + continue;
> >> + nr_shared = cpumask_weight(&leaf->shared_cpu_map);
> >> + if (!nr_shared)
> >> + continue;
> >> + size_data += leaf->size / nr_shared;
> >> + }
> >> + ci->size_data = size_data;
> >> +}
> >
> > This needs comments.
> >
> > It would be nice to add a comment on top describing the limitation of
> > CACHE_TYPE_UNIFIED here in the context of
> > update_data_cache_size_cpu().
>
> Sure. Will do that.
>

Thanks.

> > The L2 cache could be unified but much smaller than a L3 or other
> > last-level-cache. It's not clear from the code what level of cache is being
> > used, due to a lack of familiarity with the cpu_cacheinfo code, but size_data
> > is not the size of a cache; it appears to be the share of a cache a CPU
> > would have under ideal circumstances.
>
> Yes. And it isn't for one specific level of cache. It's the sum of the
> per-CPU shares of all levels of cache. But the calculation is inaccurate.
> More details are in the reply below.
>
> > However, as it also appears to be
> > iterating the cache hierarchy, this may not be accurate. Caches may or may
> > not allow data to be duplicated between levels, so the value may be
> > inaccurate.
>
> Thank you very much for pointing this out! A cache can be inclusive
> or not, so we cannot calculate the per-CPU slice across all cache levels
> by blindly adding them together. I will change this in a follow-on
> patch.
>

Please do. I would strongly suggest basing this on the LLC only, because it's
the only value you can be sure of. This change is the only one that may
warrant a respin of the series, as the history will otherwise be somewhat
confusing.

--
Mel Gorman
SUSE Labs