Re: BITS handling of CPU microcode updates
From: Burt Triplett
Date: Mon Mar 21 2011 - 19:46:29 EST
Sorry for the delayed response; I wanted to make sure I could give you
an answer that would agree with the Intel Software Developer's Manual,
and that ended up meaning I needed to start the process of updating the
relevant part of the SDM. :)
On 3/11/2011 6:30 PM, Henrique de Moraes Holschuh wrote:
> The BITS handling of Intel CPU microcode updates does not match either the
> official documentation I could find, or the Linux kernel code.
> The documentation clearly states that microcode revision levels are a
> *signed* 32-bit number, and that implies it should be subject to signed
> comparisons.
> The Linux kernel code considers it an unsigned 32-bit number, and does an
> unsigned comparison (this is likely a bug).
> And the BITS code mentions something called a "BWG", and will always install
> a microcode with a revision < 0, it will never install a microcode with a
> revision of zero, and does a normal version comparison if the revision is
> greater than zero.
> AFAIK, just like extended signatures (which are yet to be seen in the wild),
> microcodes with negative or zero revision levels have never been published to
> operating system vendors, so those discrepancies have had, so far, no impact
> in the field. But they could well be latent bugs.
> Can you please clarify what is the correct behaviour?
The Intel SDM correctly identifies microcode revision numbers as signed.
However, a simple signed comparison doesn't actually capture the
correct logic, nor does an unsigned comparison, though with either one
the problem rarely comes up in practice.
Negative microcode revision numbers only appear on microcodes used in
test environments for debugging purposes.
For the following explanation, let X = the version of the microcode
currently in the CPU, and let Z = the version of the microcode you
potentially want to load depending on the revision check.
If you have a microcode with Z < 0, the user knows what they're doing
and they want to load that microcode regardless of revision. Always
load such a microcode regardless of X (it doesn't matter if X < Z or X > Z).
If you have a microcode with Z > 0, and X > 0 as well (the CPU has a
production microcode in it), then load the microcode only if newer (Z >
X). Since microcodes with negative revisions only appear in test lab
environments (as you noted, they don't appear in the wild), and since
with positive revisions this behavior matches either a simple signed or
unsigned comparison, the subtly wrong results haven't appeared outside
of test lab environments. :)
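To make the point about plain comparisons concrete, here is a minimal
sketch in C. It assumes the revision is held in an int32_t; the helper
names and the sample revision values are illustrative only and are not
taken from BITS or from the Linux kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* x: revision currently in the CPU; z: revision of the candidate update. */

/* Hypothetical helper: "newer" according to a plain signed comparison. */
bool newer_signed(int32_t x, int32_t z)
{
    return z > x;
}

/* Hypothetical helper: "newer" when the same bits are read as unsigned. */
bool newer_unsigned(int32_t x, int32_t z)
{
    return (uint32_t)z > (uint32_t)x;
}
```

With X = 0x12 and Z = 0x15 both helpers agree and return true. With
X = 0x12 and Z = -1 the signed helper returns false, and with X = -1 and
Z = -2 the unsigned helper returns false, even though a microcode with
Z < 0 should always be loaded.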
The interesting case comes up when X < 0 and Z > 0: the CPU already has
a microcode loaded with a negative revision, and you have a production
microcode you might want to load. In this case, the correct behavior
differs based on whether the microcode loader runs automatically (such
as the tools that load microcode at Linux boot time), or acts with
*explicit* user action (such as BITS, the BIOS int 15 handler, or the
Linux tools *if* they can distinguish the case where the user explicitly
ran them from the case of running automatically without user intervention).
Tools which run automatically, without explicit user action, should not
attempt to load a microcode if (X < 0) and (Z > 0). Doing so makes life
very difficult for people in those test lab environments: they put a
microcode they want to test in the BIOS or load it via BITS, but then
the OS driver automatically overrides it with the latest production
microcode. So, tools which run automatically without explicit user
action should follow this rule:
if ((Z < 0) || (Z > 0 && X > 0 && Z > X)) load_microcode();
Tools which load microcode in response to *explicit* user actions should
override a negative-revision microcode with a positive-revision
microcode. Such tools should follow this rule:
if ((Z < 0) || (Z > 0 && Z > X)) load_microcode();
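The two rules can also be written as small predicates, which makes them
easy to compare and test. This is only a sketch: the function names are
made up for illustration, and the revisions are assumed to be held in
int32_t values.

```c
#include <stdbool.h>
#include <stdint.h>

/* x: revision currently in the CPU; z: revision of the candidate update. */

/* Rule for loaders that run automatically, without explicit user action. */
bool should_load_automatic(int32_t x, int32_t z)
{
    if (z < 0)
        return true;    /* debug microcode: always load */
    /* Production microcode: only replace an older production microcode. */
    return z > 0 && x > 0 && z > x;
}

/* Rule for loaders that act on an explicit user request. */
bool should_load_explicit(int32_t x, int32_t z)
{
    if (z < 0)
        return true;    /* debug microcode: always load */
    /* Production microcode: also replaces a negative-revision microcode. */
    return z > 0 && z > x;
}
```

For example, with X = -3 and Z = 0x15, should_load_automatic() returns
false while should_load_explicit() returns true. With X = 0x12 and
Z = 0x15 both return true, and neither ever loads Z = 0.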
Currently, as far as I know, the Linux microcode driver and userspace
tools do not distinguish these two cases: the logic invoked
automatically at boot time matches the logic run if the user invokes
microcode_ctl explicitly. Given that, the revision logic in the kernel
or microcode_ctl should match the "tools which run automatically,
without explicit user action" case above:
if ((Z < 0) || (Z > 0 && X > 0 && Z > X)) load_microcode();
If microcode_ctl added an option to distinguish these two cases, it
could apply the alternative logic when explicitly requested.
Note that you should *not* see production systems shipping with
negative-revision microcodes.
I've started the process of getting the SDM changed to document the
logic described above, and distinguish the two different types of
microcode-loading tools.
Hope that helps,
Burt Triplett