Re: [PATCH] mm: simplify zone_idx()

From: Matthew Wilcox
Date: Tue Apr 15 2025 - 10:49:15 EST


On Tue, Apr 15, 2025 at 12:34:27PM +0000, gaoxu wrote:
> > Growing it doesn't feel like a big deal. Although "saves two assembly
> > instructions" is also not exactly a big win. If it saved a cacheline reference,
> > that might be more interesting, but it seems like it's more likely to introduce a
> > cacheline reference than save one. Maybe just not worth doing?
>
> Zone, zone_pgdat, and node_zones are all considered hot data; most of the time,
> they reside in the cache. In contrast, zone_idx in the patch is not hot data,
> and executing ((zone)->zone_idx) will add a new cache line.
> Am I understanding this correctly?

Today's CPUs are rarely limited by the number of instructions they can
execute. For example, the ARM X4 can execute 8 instructions per clock.
Most of the time most of the CPU is idle, waiting on cache misses. A
cache miss (all the way to memory) is about 100ns, so if said CPU is
clocked at 3.4GHz, that's 340 cycles, or roughly 2700 instructions that
could be executed instead of waiting for that cacheline. Other top-end
CPUs have similar numbers;
you can find weak CPUs out there which have smaller ratios, but we don't
tend to optimise for low-end CPUs.
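
To make the arithmetic above concrete, here is a minimal sketch of the
calculation. The function name and parameters are made up for
illustration (nothing like this exists in the kernel), and it assumes
the core could otherwise sustain its peak issue width for the whole
stall, which is an upper bound rather than a measurement.

```c
#include <assert.h>

/*
 * Hypothetical helper: upper bound on the instructions a core could have
 * issued while stalled on one cache miss that goes all the way to memory.
 *
 * miss_ns   - miss latency in nanoseconds (about 100 in the example)
 * clock_mhz - core clock in MHz (3400 for 3.4GHz)
 * ipc       - peak instructions per clock (8 in the example)
 */
static unsigned long stall_instruction_budget(unsigned long miss_ns,
					      unsigned long clock_mhz,
					      unsigned long ipc)
{
	unsigned long cycles;

	assert(clock_mhz > 0 && ipc > 0);

	/* MHz is cycles per microsecond, so divide by 1000 to get per ns. */
	cycles = miss_ns * clock_mhz / 1000;

	return cycles * ipc;
}
```

Calling stall_instruction_budget(100, 3400, 8) gives 340 cycles times 8,
i.e. 2720 instructions, which is where the "roughly 2700" comes from.
Saving two instructions is noise next to that.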

Therefore it is more important to avoid cacheline misses than it is to
reduce the number of instructions executed. You haven't measured any
improvement from your patch, so I think we should default to not
changing anything.