Re: [PATCH] x86/PAT: have pat_enabled() properly reflect state when running on e.g. Xen
From: Jan Beulich
Date: Tue Jul 12 2022 - 02:04:55 EST
On 11.07.2022 19:41, Chuck Zmudzinski wrote:
> Moreover... (please move to the bottom of the code snippet
> for more information about my tests in the Xen PV environment...)
>
> void init_cache_modes(void)
> {
> 	u64 pat = 0;
>
> 	if (pat_cm_initialized)
> 		return;
>
> 	if (boot_cpu_has(X86_FEATURE_PAT)) {
> 		/*
> 		 * CPU supports PAT. Set PAT table to be consistent with
> 		 * PAT MSR. This case supports "nopat" boot option, and
> 		 * virtual machine environments which support PAT without
> 		 * MTRRs. In specific, Xen has unique setup to PAT MSR.
> 		 *
> 		 * If PAT MSR returns 0, it is considered invalid and emulates
> 		 * as No PAT.
> 		 */
> 		rdmsrl(MSR_IA32_CR_PAT, pat);
> 	}
>
> 	if (!pat) {
> 		/*
> 		 * No PAT. Emulate the PAT table that corresponds to the two
> 		 * cache bits, PWT (Write Through) and PCD (Cache Disable).
> 		 * This setup is also the same as the BIOS default setup.
> 		 *
> 		 * PTE encoding:
> 		 *
> 		 *       PCD
> 		 *       |PWT  PAT
> 		 *       ||    slot
> 		 *       00    0    WB : _PAGE_CACHE_MODE_WB
> 		 *       01    1    WT : _PAGE_CACHE_MODE_WT
> 		 *       10    2    UC-: _PAGE_CACHE_MODE_UC_MINUS
> 		 *       11    3    UC : _PAGE_CACHE_MODE_UC
> 		 *
> 		 * NOTE: When WC or WP is used, it is redirected to UC- per
> 		 * the default setup in __cachemode2pte_tbl[].
> 		 */
> 		pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
> 		      PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC);
> 	}
>
> 	else if (!pat_bp_enabled) {
> 		/*
> 		 * In some environments, specifically Xen PV, PAT
> 		 * initialization is skipped because MTRRs are
> 		 * disabled even though PAT is available. In such
> 		 * environments, set PAT to initialized and enabled to
> 		 * correctly indicate to callers of pat_enabled() that
> 		 * PAT is available and prevent PAT from being disabled.
> 		 */
> 		pat_bp_enabled = true;
> 		pr_info("x86/PAT: PAT enabled by init_cache_modes\n");
> 	}
>
> 	__init_cache_modes(pat);
> }
>
> This function, patched with the extra 'else if' block, fixes the
> regression on my Xen workstation, and the pr_info message
> "x86/PAT: PAT enabled by init_cache_modes" appears in the logs
> when running this patched kernel in my Xen Dom0. This means
> that in the Xen PV environment on my Xen Dom0 workstation,
> rdmsrl(MSR_IA32_CR_PAT, pat) successfully tested for the presence
> of PAT on the virtual CPU that Xen exposed to the Linux kernel on my
> Xen Dom0 workstation. At least that is what I think my tests prove.
>
> So why is this not a valid way to test for the existence of
> PAT in the Xen PV environment? Are the existing comments
> in init_cache_modes() about supporting both the case when
> the "nopat" boot option is set and the specific case of Xen and
> MTRR disabled wrong? My testing confirms those comments are
> correct.
At the very least this ignores a possible "nopat" option that an admin
may have passed to the kernel.
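
To illustrate the concern, here is a minimal user-space sketch of the control flow in the quoted function. It is a simplified model, not kernel code: struct pat_state, its fields, and both function names are hypothetical stand-ins, and the guarded variant is only one conceivable way to honor "nopat", not a statement of what the kernel does or should do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel state; not the real definitions. */
struct pat_state {
	bool cpu_has_pat;     /* models boot_cpu_has(X86_FEATURE_PAT) */
	bool nopat_requested; /* models "nopat" on the kernel command line */
	bool bp_enabled;      /* models pat_bp_enabled */
	uint64_t msr;         /* models the value read from MSR_IA32_CR_PAT */
};

/*
 * Models the quoted init_cache_modes() with the extra 'else if' block.
 * Returns what pat_enabled() would report afterwards in this model.
 */
bool patched_enables_pat(struct pat_state s)
{
	uint64_t pat = 0;

	if (s.cpu_has_pat)
		pat = s.msr;

	if (!pat) {
		/* No PAT: the emulated table is used, flag left untouched. */
	} else if (!s.bp_enabled) {
		/* The flag is set no matter why it was false before. */
		s.bp_enabled = true;
	}

	return s.bp_enabled;
}

/*
 * Hypothetical variant (an assumption, for illustration only) that
 * leaves the flag alone when "nopat" was requested.
 */
bool guarded_enables_pat(struct pat_state s)
{
	uint64_t pat = 0;

	if (s.cpu_has_pat)
		pat = s.msr;

	if (pat && !s.bp_enabled && !s.nopat_requested)
		s.bp_enabled = true;

	return s.bp_enabled;
}
```

In this model, a state with cpu_has_pat and nopat_requested both true, bp_enabled false, and any non-zero msr makes patched_enables_pat() return true, i.e. the admin's request is overridden, while guarded_enables_pat() returns false.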
Jan