Re: [PATCH v6 00/90] x86: Introduce a centralized CPUID data model

From: Ahmed S. Darwish

Date: Mon Apr 27 2026 - 14:47:38 EST

Hi Boris,

On Mon, 13 Apr 2026, Ahmed S. Darwish wrote:
>
> On Tue, 31 Mar 2026, Borislav Petkov wrote:
> >
> > Well, since the goal is to have *all* CPUID leaves available to the kernel,
> > then we *technically* don't need the synthetic ones anymore with the exception
> > of a handful ones which we defined for ourselves, like X86_FEATURE_ALWAYS, for
> > example.
> >
> > But *all* synthetic bits which have correspondence to real CPUID leaves - and
> > they're synthetic because we wanted to save space... i.e., all those bits in
> > arch/x86/kernel/cpu/scattered.c, they don't need synthetic flags anymore
> > because the corresponding full leafs (damn spelling of Blätter eh!) are there.
> >
> > Then, I'm thinking, we can reorder all the remaining really-synthetic ones
> > into the unique 4-byte entries and then not even expose them in any db and not
> > make them available in anything because we will have to cast them in stone
> > then.
> >
>
> I guess that's a very sensible plan.
>
> The X86_FEATURE integration and synthetic bits work is covered by the late
> part of the queue (patches #77 => #90), so I'll rework 'em to do the above.
>

So I prototyped this.

Removing the scattered hardware-backed feature bits from within synthetic
X86_FEATURE words (i.e. the ones listed in scattered.c) required changing
the mapping tables at patch (79/90), "x86/cpuid: Introduce a compile-time
X86_FEATURE word map", from:

    X86_FEATURE word => CPUID table word

to:

    X86_FEATURE bit => CPUID table bit

The prototype did exactly that, and it also converted all the capability
code in common.c: apply_forced_caps(), identify_cpu(), store_cpu_caps(),
microcode_check(), etc.

That part worked fine.
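
To make the difference between the two mapping granularities concrete,
here is a minimal user-space sketch. It is not the code from patch 79/90:
the struct names, helper names, and table layout below are all hypothetical
and only model the idea. The one thing taken from the kernel is the
X86_FEATURE encoding itself (word * 32 + bit).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model; not the kernel's actual data structures. */

/* Word-granular: a whole X86_FEATURE word mirrors one CPUID table word. */
struct word_map {
	uint8_t table_word;	/* index of the backing CPUID table word */
};

/* Bit-granular: each X86_FEATURE bit names its own table word and bit. */
struct bit_map {
	uint8_t table_word;	/* index of the backing CPUID table word */
	uint8_t table_bit;	/* bit position inside that table word */
};

static inline int table_test(const uint32_t *table, unsigned int word,
			     unsigned int bit)
{
	assert(bit < 32);
	return (table[word] >> bit) & 1;
}

/* One map entry per feature word; the bit position is carried over as is. */
static inline int test_by_word_map(const uint32_t *table,
				   const struct word_map *map,
				   unsigned int feature)
{
	return table_test(table, map[feature / 32].table_word, feature % 32);
}

/* One map entry per feature bit; word and bit can both be remapped. */
static inline int test_by_bit_map(const uint32_t *table,
				  const struct bit_map *map,
				  unsigned int feature)
{
	return table_test(table, map[feature].table_word,
			  map[feature].table_bit);
}
```

With the per-bit form, a feature bit can point straight at its backing leaf
bit, which is what would make the scattered copies unnecessary. The cost is
that every consumer which indexes by feature / 32 has to be revisited.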

But then there were a lot of places with deeply ingrained logic that deals
with X86_FEATURE in terms of words. This includes KVM's reverse feature
maps, early 32-bit ASM code, the AWK and shell code generation build
scripts, early boot feature validation, feature naming tables, and others;
all relying on X86_FEATURE word access. [*] At that point, this started
looking like a new phase of the CPUID work.
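
As a rough illustration of what "word access" means for such code, below is
a simplified sketch of a word-wise required-feature check. It is a
hypothetical model, not an excerpt from the tree: the function name, the
array size, and the return convention are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical size and names; a simplified model, not kernel code. */
#define NR_FEATURE_WORDS 3

/*
 * Word-wise validation: compare whole 32-bit feature words at once.
 * Fills missing[] with the required-but-absent bits of each word and
 * returns a bitmask of the words that have at least one such bit.
 */
static inline uint32_t missing_feature_words(const uint32_t *required,
					     const uint32_t *present,
					     uint32_t *missing)
{
	uint32_t bad_words = 0;

	assert(required && present && missing);

	for (unsigned int i = 0; i < NR_FEATURE_WORDS; i++) {
		missing[i] = required[i] & ~present[i];
		if (missing[i])
			bad_words |= 1u << i;
	}

	return bad_words;
}
```

Code shaped like this assumes that bit N of feature word W sits at bit N of
one backing word, so a per-bit mapping would need the required masks and
the comparison loop to be reworked as well.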

So, my suggestion would be:

* Keep the current word-granular X86_FEATURE compile-time mappings.

* Keep the current synthetic CPUID(0x4c780001) and CPUID(0x4c780002)
  bitfield listings, with their hardware-backed scattered bits, as is.
  Mark them as "v1" instead of setting them in stone.

* Work on getting this large patch queue merged, while in the background I
  continue transitioning the remaining CPUID call sites to the CPUID API.

* Then, after all that is properly discussed and merged, I'll follow up
  with a separate series that converts the X86_FEATURE mappings from
  word-based to per-bit, together with the KVM, boot/asm, and build-time
  machinery that currently assumes X86_FEATURE word access. At the end of
  that series, the hardware-backed bits within the Linux synthetic words
  can be removed, along with all the scattered.c code.

I understand that with the current design, querying a scattered X86_FEATURE
bit (say X86_FEATURE_APERFMPERF) might lead to a different result than
directly querying its backing CPUID leaf (CPUID(0x6).ECX[0]), but this is
already the state of the kernel as of today. So IMHO it is better to have
all this new CPUID code merged and widely tested in a kernel release first,
then work on that next.
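
For illustration only, the sketch below models one way the two views could
diverge: the scattered bit is a copy, so clearing the copy leaves the raw
leaf untouched. This is an assumption about the mechanism, not something
the series states. The struct, the helpers, and the synthetic bit position
are made up; only the CPUID(0x6).ECX[0] location comes from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model; names and layout are illustrative only. */
struct cpu_model {
	uint32_t leaf6_ecx;	/* raw CPUID(0x6).ECX value */
	uint32_t synth_word;	/* Linux-defined word holding scattered bits */
};

#define LEAF6_ECX_APERFMPERF	0	/* CPUID(0x6).ECX[0] */
#define SYNTH_APERFMPERF	5	/* made-up position in the synthetic word */

/* Copy the hardware bit into the synthetic word. */
static inline void scan_scattered(struct cpu_model *c)
{
	assert(c);
	if (c->leaf6_ecx & (1u << LEAF6_ECX_APERFMPERF))
		c->synth_word |= 1u << SYNTH_APERFMPERF;
}

/* Clearing the feature only touches the synthetic copy. */
static inline void force_clear_aperfmperf(struct cpu_model *c)
{
	assert(c);
	c->synth_word &= ~(1u << SYNTH_APERFMPERF);
}

/* The view through the synthetic feature bit. */
static inline int feature_aperfmperf(const struct cpu_model *c)
{
	return (c->synth_word >> SYNTH_APERFMPERF) & 1;
}

/* The view through the backing leaf. */
static inline int leaf_aperfmperf(const struct cpu_model *c)
{
	return (c->leaf6_ecx >> LEAF6_ECX_APERFMPERF) & 1;
}
```

After scan_scattered() followed by force_clear_aperfmperf(), the feature
query reports 0 while the leaf query still reports 1. A per-bit mapping
with a single backing store would remove that second copy.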

[*] git grep -E '(NCAPINTS|NBUGINTS|REQUIRED_MASK|cpuid\.flags)' arch/x86/

Thanks a lot!
Ahmed