Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT
From: Bernhard Held
Date: Sun May 28 2017 - 14:30:35 EST
Hi,
this patch breaks the boot of my kernel. The last message is "Booting
the kernel.".
My setup might be unusual: I'm running a Xeon E5450 (LGA 771) in a
Gigabyte G33-DS3R board (LGA 775). The BIOS is patched with the
microcode of the E5450 and recognizes the CPU.
Please find below the dmesg of the latest kernel w/o the PAT-patch.
I'm happy to provide more information or to test patches.
Have fun,
Bernhard
[ 0.000000] Linux version 4.12.0-rc2-linus+ (berny@quad) (gcc version 6.3.1 20170202 [gcc-6-branch revision 245119] (SUSE Linux) ) #152 SMP PREEMPT Sun May 28 19:26:20 CEST 2017
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.12.0-rc2-linus+ root=/dev/mapper/VGMX300-root resume=/dev/sda2 showopts radeon.dpm=1 memmap=1$0xe4fd net.ifnames=0
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Enabled xstate features 0x3, context size is 576 bytes, using 'standard' format.
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dbff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009f800-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000cfedffff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000cfee0000-0x00000000cfee2fff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x00000000cfee3000-0x00000000cfeeffff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x00000000cfef0000-0x00000000cfefffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000d0000000-0x00000000dfffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x00000001afffffff] usable
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] e820: user-defined physical RAM map:
[ 0.000000] user: [mem 0x0000000000000000-0x000000000000e4fc] usable
[ 0.000000] user: [mem 0x000000000000e4fd-0x000000000000e4fd] reserved
[ 0.000000] user: [mem 0x000000000000e4fe-0x000000000009dbff] usable
[ 0.000000] user: [mem 0x000000000009f800-0x000000000009ffff] reserved
[ 0.000000] user: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[ 0.000000] user: [mem 0x0000000000100000-0x00000000cfedffff] usable
[ 0.000000] user: [mem 0x00000000cfee0000-0x00000000cfee2fff] ACPI NVS
[ 0.000000] user: [mem 0x00000000cfee3000-0x00000000cfeeffff] ACPI data
[ 0.000000] user: [mem 0x00000000cfef0000-0x00000000cfefffff] reserved
[ 0.000000] user: [mem 0x00000000d0000000-0x00000000dfffffff] reserved
[ 0.000000] user: [mem 0x00000000fec00000-0x00000000ffffffff] reserved
[ 0.000000] user: [mem 0x0000000100000000-0x00000001afffffff] usable
[ 0.000000] SMBIOS 2.4 present.
[ 0.000000] DMI: Gigabyte Technology Co., Ltd. G33-DS3R/G33-DS3R, BIOS F7L 07/31/2009
[ 0.000000] tsc: Fast TSC calibration using PIT
[ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[ 0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[ 0.000000] e820: last_pfn = 0x1b0000 max_arch_pfn = 0x400000000
[ 0.000000] MTRR default type: uncachable
[ 0.000000] MTRR fixed ranges enabled:
[ 0.000000] 00000-9FFFF write-back
[ 0.000000] A0000-BFFFF uncachable
[ 0.000000] C0000-CDFFF write-protect
[ 0.000000] CE000-EFFFF uncachable
[ 0.000000] F0000-FFFFF write-through
[ 0.000000] MTRR variable ranges enabled:
[ 0.000000] 0 base 0000000000 mask 0F00000000 write-back
[ 0.000000] 1 base 00E0000000 mask 0FE0000000 uncachable
[ 0.000000] 2 base 00D0000000 mask 0FF0000000 uncachable
[ 0.000000] 3 base 0100000000 mask 0F00000000 write-back
[ 0.000000] 4 base 01C0000000 mask 0FC0000000 uncachable
[ 0.000000] 5 base 01B0000000 mask 0FF0000000 uncachable
[ 0.000000] 6 base 00CFF00000 mask 0FFFF00000 uncachable
[ 0.000000] 7 disabled
[ 0.000000] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT
[ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[ 0.000000] total RAM covered: 6143M
[ 0.000000] Found optimal setting for mtrr clean up
[ 0.000000] gran_size: 64K chunk_size: 2M num_reg: 7 lose cover RAM: 0G
[ 0.000000] e820: update [mem 0xcff00000-0xffffffff] usable ==> reserved
[ 0.000000] e820: last_pfn = 0xcfee0 max_arch_pfn = 0x400000000
[ 0.000000] Base memory trampoline at [ffff880000097000] 97000 size 24576
[ 0.000000] BRK [0x01e34000, 0x01e34fff] PGTABLE
[ 0.000000] BRK [0x01e35000, 0x01e35fff] PGTABLE
[ 0.000000] BRK [0x01e36000, 0x01e36fff] PGTABLE
[ 0.000000] BRK [0x01e37000, 0x01e37fff] PGTABLE
[ 0.000000] BRK [0x01e38000, 0x01e38fff] PGTABLE
[ 0.000000] BRK [0x01e39000, 0x01e39fff] PGTABLE
[ 0.000000] RAMDISK: [mem 0x36c5f000-0x37626fff]
On 05/24/2017 at 12:21 PM, tip-bot for Mikulas Patocka wrote:
Commit-ID: cbed27cdf0e3f7ea3b2259e86b9e34df02be3fe4
Gitweb: http://git.kernel.org/tip/cbed27cdf0e3f7ea3b2259e86b9e34df02be3fe4
Author: Mikulas Patocka <mpatocka@xxxxxxxxxx>
AuthorDate: Tue, 18 Apr 2017 15:07:11 -0400
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Wed, 24 May 2017 10:17:23 +0200
x86/PAT: Fix Xorg regression on CPUs that don't support PAT
In the file arch/x86/mm/pat.c, there's a '__pat_enabled' variable. The
variable is set to 1 by default and the function pat_init() sets
__pat_enabled to 0 if the CPU doesn't support PAT.
However, on AMD K6-3 CPUs, the processor initialization code never calls
pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
returns true, even though the K6-3 CPU doesn't support PAT.
The result of this bug is that a kernel warning is produced when attempting to
start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
Another symptom of this bug is that the framebuffer driver doesn't set the
K6-3 MTRR registers:
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
...
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
To fix the bug, change pat_enabled() so that it returns true only if PAT
initialization was actually done.
Also, I changed boot_cpu_has(X86_FEATURE_PAT) to
this_cpu_has(X86_FEATURE_PAT) in pat_ap_init(), so that we check the PAT
feature on the processor that is being initialized.
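The difference between the old and new pat_enabled() can be sketched as a small user-space model. This is only an illustration under stated assumptions, not kernel code: struct pat_model, model_boot(), model_pat_init(), pat_enabled_old() and pat_enabled_new() are hypothetical names, and the cpu_has_pat argument stands in for the X86_FEATURE_PAT check. The MSR write is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the two flags in arch/x86/mm/pat.c. */
struct pat_model {
	bool pat_allowed;     /* models __pat_enabled: true by default */
	bool pat_initialized; /* models __pat_initialized: false by default */
};

/* State at boot, before any PAT setup code has run. */
static struct pat_model model_boot(void)
{
	struct pat_model m = { .pat_allowed = true, .pat_initialized = false };
	return m;
}

/*
 * Models pat_init() on the boot CPU. If the CPU lacks PAT, the
 * "allowed" flag is cleared; otherwise the "initialized" flag is set
 * at the point where the real code writes MSR_IA32_CR_PAT.
 */
static void model_pat_init(struct pat_model *m, bool cpu_has_pat)
{
	if (!m->pat_allowed)
		return;
	if (!cpu_has_pat) {
		m->pat_allowed = false;
		return;
	}
	m->pat_initialized = true;
}

/* Before the patch: reports the default unless pat_init() cleared it. */
static bool pat_enabled_old(const struct pat_model *m)
{
	return m->pat_allowed;
}

/* After the patch: reports true only once initialization has happened. */
static bool pat_enabled_new(const struct pat_model *m)
{
	return m->pat_initialized;
}
```

In this model, a CPU whose setup path never calls model_pat_init() (the K6-3 case described above) makes pat_enabled_old() return true straight after model_boot(), while pat_enabled_new() returns false.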
Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Brian Gerst <brgerst@xxxxxxxxx>
Cc: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
Cc: H. Peter Anvin <hpa@xxxxxxxxx>
Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Luis R. Rodriguez <mcgrof@xxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Toshi Kani <toshi.kani@xxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # v4.2+
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1704181501450.26399@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
arch/x86/mm/pat.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 9b78685..83a59a6 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -65,9 +65,11 @@ static int __init nopat(char *str)
}
early_param("nopat", nopat);
+static bool __read_mostly __pat_initialized = false;
+
bool pat_enabled(void)
{
- return !!__pat_enabled;
+ return __pat_initialized;
}
EXPORT_SYMBOL_GPL(pat_enabled);
@@ -225,13 +227,14 @@ static void pat_bsp_init(u64 pat)
}
wrmsrl(MSR_IA32_CR_PAT, pat);
+ __pat_initialized = true;
__init_cache_modes(pat);
}
static void pat_ap_init(u64 pat)
{
- if (!boot_cpu_has(X86_FEATURE_PAT)) {
+ if (!this_cpu_has(X86_FEATURE_PAT)) {
/*
* If this happens we are on a secondary CPU, but switched to
* PAT on the boot CPU. We have no way to undo PAT.
@@ -306,7 +309,7 @@ void pat_init(void)
u64 pat;
struct cpuinfo_x86 *c = &boot_cpu_data;
- if (!pat_enabled()) {
+ if (!__pat_enabled) {
init_cache_modes();
return;
}