Re: [PATCH RFC] Microcode late loading feature identification
From: Raj, Ashok
Date: Mon May 11 2020 - 11:23:32 EST
Hi Mihai
Thanks for attempting a fix. To solve this for real there
are several other factors to consider, and I'm afraid it's not as simple
as you have articulated here. There are lots of practical limitations that
prevent us from solving this completely. But we haven't given up :-)
In order to be successful, this needs to be factored in by the vendor, either
as part of the development process or somehow generated automatically. Both
have pitfalls, and some of them might become clear as you read below.
On Mon, May 11, 2020 at 05:11:23PM +0300, Mihai Carabas wrote:
> On 27.04.2020 10:27, Mihai Carabas wrote:
> >This RFC patch set aims to provide a way to identify the modifications
> >brought in by the new microcode updated at runtime (aka microcode late
> >loading). This was debated last year and this patch set implements
> >point #1 from Thomas Gleixner's idea:
> >https://lore.kernel.org/lkml/alpine.DEB.2.21.1909062237580.1902@xxxxxxxxxxxxxxxxxxxxxxx/
> >
>
> +Ashok and Thomas to get a feedback from vendor side on file
> format/integration in the microcode blob and signature.
To understand the complications of microcode there are a few things to consider.
We have been working on this internally, and here is why it's difficult to
simplify it as +msr/-msr, +cpuid etc. Yes, these are things that are possibly
controlled by microcode, but microcode has several parts, not just CPU microcode.
The revision you see is just one big running number. There are other parts of the
microcode whose internal versions you don't see. In addition, some parts of the
microcode are effective only when deployed via FIT, which the CPU picks up right
after reset. During a late load, even though you see the version updated, it doesn't
mean all the internal versions of the ucode are latched and effective.
In addition, there are differences in how some mitigations are deployed. As you know,
some have MSRs that the OS can find; others need BIOS/early boot to
make the mitigations effective. In order to get a full picture of whether a microcode
file is late-loadable you need to know a lot more about how we got to the
version loaded on the CPU.
>
> Thank you,
> Mihai
>
> >This patch set has the following patches:
> >
> >- patch 1 introduces a new metadata file that comes with the microcode
> >(provided by the CPU manufacturer) and describes what modifications are
> >done by loading the new microcode
> >
> >- patch 2 parses the metadata file and verifies it against kernel
> >policy. In this patch, as an RFC, the kernel policy imposes
> >the rule that no feature may be removed. If one would be, the new
> >microcode is not loaded. The policy can be further extended and described
> >in different ways
Haven't read the individual patches yet, but you would need every interim
patch's metadata to always be available.
If you move from patch x->y you can find that an MSR was removed.
But say you go from patch x->z, and no MSR was removed in patch y->z.
You need to process and collate all the MSR/CPUID changes to comprehend what
changes between x->z, since that move also removes the MSR, for example.
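
A minimal sketch of the collation problem described above, assuming a purely
hypothetical metadata format in which each patch records its base revision and
the features it adds or removes relative to that base. The names PatchDelta,
collate and the feature strings are illustrative only and do not come from the
RFC patches or from any vendor format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchDelta:
    """Hypothetical per-patch metadata, relative to the patch's base revision."""
    revision: int
    base: int
    added: frozenset = frozenset()
    removed: frozenset = frozenset()


def collate(deltas, current_rev, target_rev):
    """Return the net (added, removed) feature sets for current_rev -> target_rev.

    Walks the chain of base revisions back from target_rev and raises
    LookupError if the metadata for any interim patch is unavailable.
    """
    by_rev = {d.revision: d for d in deltas}
    chain = []
    rev = target_rev
    while rev != current_rev:
        delta = by_rev.get(rev)
        # Revisions are assumed to be a running number, so a base must be older.
        if delta is None or delta.base >= rev:
            raise LookupError(f"no usable metadata for revision {rev:#x}")
        chain.append(delta)
        rev = delta.base

    added, removed = set(), set()
    for delta in reversed(chain):  # apply oldest patch first
        for feat in delta.added:
            if feat in removed:
                removed.discard(feat)  # removed earlier, then restored
            else:
                added.add(feat)
        for feat in delta.removed:
            if feat in added:
                added.discard(feat)  # added earlier, then dropped again
            else:
                removed.add(feat)
    return frozenset(added), frozenset(removed)


# Illustrative data: x = 0x10, y = 0x11, z = 0x12 (all values made up).
Y = PatchDelta(0x11, 0x10, removed=frozenset({"msr:0x123"}))
Z = PatchDelta(0x12, 0x11, added=frozenset({"cpuid:leaf7"}))
```

With only Z's metadata at hand, going x->z looks like it removes nothing;
collate([Y, Z], 0x10, 0x12) reports the MSR removal inherited from y, and
collate([Z], 0x10, 0x12) fails because the interim metadata is missing.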
To add more complications, CPUs have fuses, and each patch's content can be effective
on some but not all SKUs. So the exact same microcode can have different behavior
depending on which SKU it's loaded into. So generating the common metadata file
is also not a trivial process.
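
To illustrate the SKU point, here is a rough sketch assuming each metadata
entry carried a hypothetical fuse bitmask saying which SKUs it applies to.
The Effect type, the required_fuses field and the fuse values are invented
for illustration; real fuse layouts are not public and not this simple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Effect:
    """Hypothetical metadata entry for one change made by a microcode patch."""
    feature: str
    action: str              # "add" or "remove"
    required_fuses: int = 0  # assumed bitmask; 0 means "applies to every SKU"


def effects_for_sku(effects, sku_fuses):
    """Keep only the entries whose required fuse bits are all set on this SKU."""
    return [e for e in effects
            if (sku_fuses & e.required_fuses) == e.required_fuses]


# Illustrative data: one patch, two SKUs with different fuse settings.
PATCH_EFFECTS = [
    Effect("msr:0x123", "remove", required_fuses=0b01),
    Effect("cpuid:leaf7", "add"),
]
SKU_A_FUSES = 0b01
SKU_B_FUSES = 0b10
```

The same PATCH_EFFECTS list yields two entries for SKU_A_FUSES and one for
SKU_B_FUSES, so a single common metadata file would have to encode this per-SKU
variation somehow.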
Cheers,
Ashok