Re: [PATCH] X86: don't report PAT on CPUs that don't support it

From: Mikulas Patocka
Date: Sun May 14 2017 - 18:08:04 EST




On Tue, 18 Apr 2017, Mikulas Patocka wrote:

>
>
> On Tue, 18 Apr 2017, H. Peter Anvin wrote:
>
> > On 04/18/17 12:07, Mikulas Patocka wrote:
> > > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
> > > variable is set to 1 by default and the function pat_init() sets
> > > __pat_enabled to 0 if the CPU doesn't support PAT.
> > >
> > > However, on AMD K6-3 CPU, the processor initialization code never calls
> > > pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
> > > returns true, even though the K6-3 CPU doesn't support PAT.
> > >
> > > The result of this bug is that this warning is produced when attemting to
> > > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
> > > Another symptom of this bug is that the framebuffer driver doesn't set the
> > > K6-3 MTRR registers.
> > >
> > > This patch changes pat_enabled() so that it returns true only if pat
> > > initialization was actually done.
> > >
> > > Also, I changed boot_cpu_has(X86_FEATURE_PAT) to
> > > this_cpu_has(X86_FEATURE_PAT) in pat_ap_init, so that we check the PAT
> > > feature on the processor that is being initialized.
> > >
> >
> > I'm thinking it would be better to replace __pat_enabled with
> > static_cpu_has(X86_FEATURE_PAT) and disable the feature bit if the user
> > has disabled it on the command line, just as we do with other features.
> >
> > -hpa

What's the status of this bug report? Will you accept the patch that I
sent or do you want a different patch?

If you want a special X86 capability for PAT, you can use the patch below.
Note, that we can't use X86_FEATURE_PAT for this (because if CPU supports
PAT, but pat_init() is not called, pat_enabled() would incorrectly report
true), so we need a different capability X86_FEATURE_PAT_ENABLED.

Mikulas

> If MTRR initialization fails for whatever reason, then pat_init() won't be
> called and the kernel would mistakenly believe that PAT is working
> (because there would be no one to clear X86_FEATURE_PAT).
>
> I think that pat should be reported only if pat_init() is called and
> succeeds.
>
>
> Another strange thing: pat_disable() calls init_cache_modes() - but since
> pat_disable() may not be called at all, it is possible that
> init_cache_modes() is also not called at all. It doesn't produce any
> visible misbehavior on my machine, but it doesn't seem right - we should
> not call init_cache_modes() from pat_disable() and do the initialization
> elsewhere, where it is always called.
>
> Mikulas

X86: don't report PAT on CPUs that don't support it

In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
variable is set to 1 by default and the function pat_init() sets
__pat_enabled to 0 if the CPU doesn't support PAT.

However, on AMD K6-3 CPU, the processor initialization code never calls
pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
returns true, even though the K6-3 CPU doesn't support PAT.

The result of this bug is that this warning is produced when attemting to
start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
Another symptom of this bug is that the framebuffer driver doesn't set the
K6-3 MTRR registers.

This patch introduces a new CPU capability X86_FEATURE_PAT_ENABLED (which
is set when pat_init() is called and succeeds) and changes pat_enabled()
to return this capability. Note that we can use X86_FEATURE_PAT for this
purpose, because if pat_init() is not called, there would be no one to
clear X86_FEATURE_PAT and the kernel would mistakenly behave as if PAT is
working.

x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
Modules linked in: blowfish_generic blowfish_common cbc dm_crypt dm_loop msr configfs cpufreq_userspace cpufreq_powersave cpufreq_ondemand cpufreq_conservative ext4 crc16 jbd2 mbcache hpfs nls_cp852 msdos fat matroxfb_base matroxfb_g450 matroxfb_accel cfbfillrect cfbimgblt cfbcopyarea matroxfb_DAC1064 g450_pll matroxfb_misc floppy snd_usb_audio snd_hwdep snd_usbmidi_lib snd_seq_midi snd_seq_midi_event snd_rawmidi snd_pcm snd_seq snd_seq_device snd_timer hid_generic snd usbhid hid soundcore powernow_k6 pcspkr evdev e1000 ehci_pci uhci_hcd ehci_hcd usbcore usb_common dm_mod pata_it821x unix
CPU: 0 PID: 3891 Comm: Xorg Not tainted 4.11.0-rc6-test+ #35
Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00
Call Trace:
? __warn+0xab/0xc0
? untrack_pfn+0x5c/0x9f
? warn_slowpath_null+0xf/0x13
? untrack_pfn+0x5c/0x9f
? unmap_single_vma+0x43/0x66
? unmap_vmas+0x24/0x30
? exit_mmap+0x3c/0xa5
? __mmput+0xf/0x76
? copy_process.part.76+0xb43/0x1129
? _do_fork+0x96/0x177
? do_int80_syscall_32+0x3e/0x4c
? entry_INT80_32+0x2a/0x2a
---[ end trace 8726f9d9fa90d702 ]---
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining

Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # v4.2+

---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/pat.h | 3 ++-
arch/x86/mm/pat.c | 12 ++++--------
3 files changed, 7 insertions(+), 9 deletions(-)

Index: linux-2.6/arch/x86/mm/pat.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/pat.c
+++ linux-2.6/arch/x86/mm/pat.c
@@ -64,12 +64,6 @@ static int __init nopat(char *str)
}
early_param("nopat", nopat);

-bool pat_enabled(void)
-{
- return !!__pat_enabled;
-}
-EXPORT_SYMBOL_GPL(pat_enabled);
-
int pat_debug_enable;

static int __init pat_debug_setup(char *str)
@@ -225,12 +219,14 @@ static void pat_bsp_init(u64 pat)

wrmsrl(MSR_IA32_CR_PAT, pat);

+ setup_force_cpu_cap(X86_FEATURE_PAT_ENABLED);
+
__init_cache_modes(pat);
}

static void pat_ap_init(u64 pat)
{
- if (!boot_cpu_has(X86_FEATURE_PAT)) {
+ if (!this_cpu_has(X86_FEATURE_PAT)) {
/*
* If this happens we are on a secondary CPU, but switched to
* PAT on the boot CPU. We have no way to undo PAT.
@@ -305,7 +301,7 @@ void pat_init(void)
u64 pat;
struct cpuinfo_x86 *c = &boot_cpu_data;

- if (!pat_enabled()) {
+ if (!__pat_enabled) {
init_cache_modes();
return;
}
Index: linux-2.6/arch/x86/include/asm/pat.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/pat.h
+++ linux-2.6/arch/x86/include/asm/pat.h
@@ -4,7 +4,8 @@
#include <linux/types.h>
#include <asm/pgtable_types.h>

-bool pat_enabled(void);
+#define pat_enabled() static_cpu_has(X86_FEATURE_PAT_ENABLED)
+
void pat_disable(const char *reason);
extern void pat_init(void);

Index: linux-2.6/arch/x86/include/asm/cpufeatures.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/cpufeatures.h
+++ linux-2.6/arch/x86/include/asm/cpufeatures.h
@@ -104,6 +104,7 @@
#define X86_FEATURE_EXTD_APICID ( 3*32+26) /* has extended APICID (8 bits) */
#define X86_FEATURE_AMD_DCM ( 3*32+27) /* multi-node processor */
#define X86_FEATURE_APERFMPERF ( 3*32+28) /* APERFMPERF */
+#define X86_FEATURE_PAT_ENABLED ( 3*32+29) /* PAT enabled */
#define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */
#define X86_FEATURE_TSC_KNOWN_FREQ ( 3*32+31) /* TSC has known frequency */