Re: [PATCH 1/3] x86/resctrl: Get max rmid and occupancy scale directly from CPUID instead of cpuinfo_x86

From: Thomas Gleixner
Date: Sun Jun 16 2019 - 16:29:42 EST


On Sun, 16 Jun 2019, Fenghua Yu wrote:

> Although x86_cache_max_rmid and x86_cache_occ_scale are only read once
> during resctrl initialization, they are always stored in cpuinfo_x86 for
> each CPU during run time without usage. Even if resctrl is not
> configured, they still occupy space in cpuinfo_x86.

And that's a problem because?

> To save cpuinfo_x86 space and make CPU and resctrl initialization simpler,
> remove the two fields from cpuinfo_x86 and get max rmid and occupancy

What is simpler? The fact that more code fiddles with CPUID? That's exactly
the wrong direction.

The storage size of struct cpuinfo is largely uninteresting especially as
long as we keep num_online_cpus copies of the same information around.

Just grep for places which invoke CPUID and then look how many of them do
it over and over and even the code in arch/x86/kernel/cpu/ is an
unpenetrable mess for exactly this reason.

The right thing to do is to have one instance of the CPUID information
which is

- a proper data structure with named fields and named bits

- a single master instance which can be mapped to all CPUs

This data structure is filled in in one go by reading out all leaves and
not by the maze we have today which puts together selected parts piecewise
and never exposes a full and consistent picture.

This allows to remove all these custom copies of CPUID leaf readouts and
allows proper filtering/disabling at a central place.

Making it a proper data structure with fields and bits gets rid of all that
hex masking/shifting nonsense which is used to decode parts of those
fields.

That's not a performance issue because all performance critical code should
use static_cpu_has() anyway. For non critical code boot_cpu_has() is
sufficient.

Upcoming secondary CPUs would do a sanity check on their CPUID content to
check whether everything is symmetric. We should do that actually today
because not detecting asymetric features early leads to exactly the issue
which was fixed recently with loading the micro code earlier than perf.
But we can't because the information retrieval is done in a gazillion of
places.

Now you might argue that the upcoming asymetric processors (SIGH!) will
require per CPU instances of the feature leafs. Sure that needs some
thought, but it needs thought even with the current code and I'm absolutely
not interested to duct tape that stuff into the current code.

The solution for this with the above scheme is to utilize the feature
mismatch detection and have a proper classification which features are
allowed to deviate and which are not. For those which can deviate, we can
provide separate storage as this information needs to be propagated to
other entities (fault handlers, placement code, xsave variants etc.). But
that's a limited amount of information and the bulk will still be the same.

This mismatch detection is essential for dealing with future asymetric
CPUs proper. When the kernel detects it, it can disable mismatching
features which are not yet handled by code which has to be aware of them.

And disabling them means that with that scheme we can actually trap CPUID
in userspace and provide it consistent and filtered information instead
of having the mismatch between kernel view and user space view.

Borislav has experimented with that already, but thanks to the marvelous
security features built into Intel (and other) CPUs this is still mostly a
drawing board exercise.

Just for the record. Before this cleanup takes place, I'm not even looking
at any patches which attempt to support asymetric processors. The current
supply of duct tape engineering horrors is sufficient for bad mood. No need
for more of that.

Thanks,

tglx