On Fri, 6 Sep 2019, Johannes Erdfelt wrote:
On Fri, Sep 06, 2019, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
What your customers are asking for is a receipe for disaster. They can
check the safety of late loading forever, it will not magically become safe
because they do so.
If you want late loading, then the whole approach needs to be reworked from
ground up. You need to make sure that all CPUs are in a safe state,
i.e. where switching of CPU feature bits of all sorts can be done with the
guarantee that no CPU will return to the wrong code path after coming out
of safe state and that any kernel internal state which depends on the
previous set of CPU feature bits has been mopped up and switched over
before CPUs are released.
You say that switching of CPU feature bits is problematic, but adding
new features should result only in a warning ("x86/CPU: CPU features
have changed after loading microcode, but might not take effect.").
Removing a CPU feature bit could be problematic. Other than HLE being
removed on Haswell (which the kernel shouldn't use anyway), have there
been any other cases?
I ask because we have successfully used late microcode loading on tens
of thousands of hosts. I'm a bit worried to see that there is a push to
remove a feature that we currently rely on.
The point is that you know what's on stake so you can evaluate precisely
upfront whether that works or not and you have experienced kernel engineers
on staff who can tell you which kind of ucode change is going to explode in
your face and which on does not.
So it's the special case of a large cloud company with experts on staff.
Now map that to the average user/sysadmin. If we proliferate this, then the
inevitable consequence will be that those people read about how great that
is and how it made your customers happy yadayadayada. Now they go and do
the same thing and guess what happens? It explodes in their face, they send
bug reports and someone else will send lousy patches to paper over the
problem. None of this ends on your desk.
Yes you can surely argue that if you give people a gun then they can shoot
themself into their foot. But in that case it's a irresponsible argument
which just put's your interest above the general rule of not offering
things which are bound to break in all flavours of wreckage especially in
the hard to diagnose way.
So if we want to do late microcode loading in a sane way then there are
only a few options and none of them exist today:
1) Micro-code contains a description of CPUID bits which are going to be
exposed after the load. Then the kernel can sanity check whether this
changes anything relevant or not. If there is a relevant change it can
reject the load and tell the admin that a reboot is required.
2) Rework CPUID feature handling so that it can reevaluate and reconfigure
the running system safely. There are a lot of things you need for that:
A) Introduce a safe state for CPUs to reach which guarantees that none
of the CPUs will return from that state via a code path which
depends on previous state and might now go the other route with data
on the stack which only fits the previous configuration.
B) Make all the cpufeature thingies run time switchable. That means
that you need to keep quite some code around which is currently init
only. That also means that you have to provide backout code for
things which set up data corresponding to cpu feature bits and so
forth.
So #2 might be finished in about 20 years from now with the result that
some of the code pathes might simply still have a
if (cpufeature_changed())
panic();
because there are things which you cannot back out. So the only sane
solution is to panic. Which is not a solution as it would be much more sane
to prevent late loading upfront and force people to reboot proper.
Now #1 is actually a sensible and feasible solution which can be pulled off
in a reasonably short time frame, avoids all the bound to be ugly and
failure laden attempts of fixing late loading completely and provides a
usable and safe solution for joe user, jack admin and the super experts at
big-cloud corporate.
That is not requiring any new format of microcode payload, as this can be
nicely done as a metadata package which comes with the microcode
payload. So you get the following backwards compatible states:
Kernel metadata result
old don't care refuse late load
new No refuse late load
new Yes decide based on metadata
Thoughts?