On Fri, Jul 15, 2022 at 04:25:49PM +0200, Juergen Gross wrote:
Today PAT is usable only with MTRR being active, with some nasty tweaks
to make PAT usable when running as a Xen PV guest, which doesn't support
MTRR.
The reason for this coupling is that both PAT MSR changes and MTRR
changes require a similar sequence, so full PAT support was added
using the already available MTRR handling.
Xen PV PAT handling can work without MTRR, as it just needs to consume
the PAT MSR setting done by the hypervisor without the ability and need
to change it. This in turn has resulted in a convoluted initialization
sequence and wrong decisions regarding cache mode availability due to
misleading PAT availability flags.
Fix all of that by allowing PAT to be used without MTRR and by adding an
environment-dependent PAT init function.
Aha, there's the explanation I was looking for.
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0a1bd14f7966..3edfb779dab5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2408,8 +2408,8 @@ void __init cache_bp_init(void)
 {
 	if (IS_ENABLED(CONFIG_MTRR))
 		mtrr_bp_init();
-	else
-		pat_disable("PAT support disabled because CONFIG_MTRR is disabled in the kernel.");
+
+	pat_cpu_init();
 }

 void cache_ap_init(void)
@@ -2417,7 +2417,8 @@ void cache_ap_init(void)
 	if (cache_aps_delayed_init)
 		return;
-	mtrr_ap_init();
+	if (!mtrr_ap_init())
+		pat_ap_init_nomtrr();
 }
So I'm reading this as: if it couldn't init the AP's MTRRs, init its PAT.
But currently, the code sets the MTRRs for the delayed case or when the
CPU is not online by calling ->set_all, which sets the MTRRs first
and then PAT.
I think the code above should simply try the two things, one after the
other, independently from one another.
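To make "independently" concrete, here is a minimal user-space sketch, not
kernel code. The flags have_mtrr/have_pat, the helpers mtrr_ap_setup() and
pat_ap_setup(), and cache_ap_init_sketch() are all hypothetical names made
up for illustration; the log array exists only so the order can be checked.

```c
#include <stdbool.h>

/* Hypothetical capability flags; a real kernel would derive these
 * from CPU feature detection rather than plain globals. */
static bool have_mtrr;
static bool have_pat;

/* Records which steps ran, and in which order. */
static char init_log[8];
static int init_log_len;

/* Stand-ins for the real per-AP setup work. */
static void mtrr_ap_setup(void) { init_log[init_log_len++] = 'M'; }
static void pat_ap_setup(void)  { init_log[init_log_len++] = 'P'; }

/* Each step depends only on its own flag, never on the outcome of
 * the other one. MTRR is attempted first, then PAT. */
static void cache_ap_init_sketch(void)
{
	if (have_mtrr)
		mtrr_ap_setup();
	if (have_pat)
		pat_ap_setup();
}
```

With have_mtrr clear and have_pat set, only the PAT step runs; with both
set, both run, MTRR first. Neither branch looks at the other's result.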
And I see you've added another stomp machine call for PAT only.
Now, what I think the design of all this should be, is:
you have a bunch of things you need to do at each point:
* cache_ap_init
* cache_aps_init
* ...
Now, in each of those, you look at whether PAT or MTRR is supported and
you do only those which are supported.
Also, the rendezvous handler should do:

	if MTRR:
		do MTRR specific stuff
	if PAT:
		do PAT specific stuff
This way you have clean definitions of what needs to happen when and you
also do *only* the things that the platform supports, by keeping the
proper order of operations - I believe MTRRs first and then PAT.
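One possible shape for such a handler, as a rough user-space sketch only:
the enum, struct cache_step, the step functions and
cache_rendezvous_sketch() are hypothetical and do not exist in the tree.
The point is just that the table order fixes the order of operations and
the feature mask decides what runs.

```c
#include <stddef.h>

/* Hypothetical feature bits for what the platform supports. */
enum cache_feature {
	CACHE_MTRR = 1 << 0,
	CACHE_PAT  = 1 << 1,
};

/* One step of the sequence, tied to the feature that needs it. */
struct cache_step {
	unsigned int feature;
	void (*fn)(void);
};

/* Records the steps that ran, in order, so the result can be inspected. */
static unsigned int ran[4];
static size_t nran;

/* Stand-ins for the real MTRR and PAT programming. */
static void mtrr_step(void) { ran[nran++] = CACHE_MTRR; }
static void pat_step(void)  { ran[nran++] = CACHE_PAT; }

/* Table order is the order of operations: MTRRs first, then PAT. */
static const struct cache_step rendezvous_steps[] = {
	{ CACHE_MTRR, mtrr_step },
	{ CACHE_PAT,  pat_step  },
};

/* Run only the steps whose feature bit is set in 'supported'. */
static void cache_rendezvous_sketch(unsigned int supported)
{
	size_t i;

	nran = 0;
	for (i = 0; i < sizeof(rendezvous_steps) / sizeof(rendezvous_steps[0]); i++)
		if (supported & rendezvous_steps[i].feature)
			rendezvous_steps[i].fn();
}
```

Calling cache_rendezvous_sketch(CACHE_PAT) would cover the Xen PV style
case where only PAT is available, and CACHE_MTRR | CACHE_PAT the usual
bare-metal one, without either path needing to know about the other.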
This way we'll get rid of that crazy maze of who calls what and when.
But first we need to define those points where stuff needs to happen and
then for each point define what stuff needs to happen.
How does that sound?