Re: 2.6.26, PAT and AMD family 6

From: Rene Herman
Date: Wed May 07 2008 - 10:08:26 EST


On 07-05-08 15:42, Arjan van de Ven wrote:
On Wed, 07 May 2008 15:00:18 +0200
Rene Herman <rene.herman@xxxxxxxxxxxx> wrote:

On 07-05-08 04:39, Yinghai Lu wrote:

On Tue, May 6, 2008 at 6:48 PM, Rene Herman
<rene.herman@xxxxxxxxxxxx> wrote:
On 2.6.25 and below, my /proc/cpuinfo looks like:

processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 7
model name : AMD Duron(tm) Processor
[ ... ]

flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge
mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow ts
while on current mainline PAT and TS (Temperature Sensor) drop
from the feature flags:

flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge
mca cmov pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow

With respect to PAT, I guess it's
9307cacad0dfe3749f00303125c6f7f0523e5616, "x86: pat cpu feature
bit setting for known cpus" but what's this about?

Did my cpuinfo lie up to this point or shouldn't the flag be
cleared? The commit message for that change is completely and
totally unhelpful.
others like to do whitebox methods, ..., please try the attached patch to
see if the Duron supports PAT.
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a428ffc..81483ec 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -314,6 +314,8 @@ static void __cpuinit early_get_cap(struct cpuinfo_x86 *c)
 	case X86_VENDOR_AMD:
 		if (c->x86 >= 0xf && c->x86 <= 0x11)
 			set_cpu_cap(c, X86_FEATURE_PAT);
+		if (c->x86 == 6 && c->x86_modes == 7)
+			set_cpu_cap(c, X86_FEATURE_PAT);
 		break;
 	case X86_VENDOR_INTEL:
 		if (c->x86 == 0xF || (c->x86 == 6 && c->x86_model >= 15))
s/modes/model/ but, as far as I'm aware, works fine other than that.
When I boot with CONFIG_X86_PAT after applying that, I see:

x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106

and PAT is retained in the feature flags. However, this I do not
consider very surprising. Why is this code doing what it is doing in
the first place?
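
For reference, a minimal sketch of how the two 64-bit values in that boot message can be read, assuming they are raw PAT MSR contents: eight one-byte entries, with the memory type in the low three bits of each. The type encodings are the architectural ones; the names pat_entry() and pat_type_name() are made up for this example and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural memory-type encodings used in PAT entries. */
enum pat_type {
	PAT_UC       = 0,	/* uncacheable */
	PAT_WC       = 1,	/* write-combining */
	PAT_WT       = 4,	/* write-through */
	PAT_WP       = 5,	/* write-protected */
	PAT_WB       = 6,	/* write-back */
	PAT_UC_MINUS = 7,	/* uncacheable, may be overridden by MTRRs */
};

/* Hypothetical helper: memory type stored in entry idx (0..7) of a PAT value. */
static inline unsigned int pat_entry(uint64_t pat, unsigned int idx)
{
	assert(idx < 8);
	return (unsigned int)((pat >> (idx * 8)) & 0x7);
}

/* Hypothetical helper: printable name for a memory-type encoding. */
static inline const char *pat_type_name(unsigned int type)
{
	switch (type) {
	case PAT_UC:       return "UC";
	case PAT_WC:       return "WC";
	case PAT_WT:       return "WT";
	case PAT_WP:       return "WP";
	case PAT_WB:       return "WB";
	case PAT_UC_MINUS: return "UC-";
	default:           return "reserved";
	}
}
```

Decoded this way, entries 1 and 5 read WT in the old value and WC in the new one; the remaining entries (WB, UC-, UC) are the same in both.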

These feature flags are read from hardware in the CPUID instruction.
Why is this code then going "ah, this CPU may _claim_ PAT but we
won't actually believe it unless it's model foo, bar or baz". Is that
feature flag buggy?
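
To make the "read from hardware" part concrete, here is a small userspace sketch, assuming the PAT flag is bit 16 of EDX from CPUID leaf 1. The helper names are hypothetical; __get_cpuid() comes from the <cpuid.h> header shipped with GCC and Clang and is only compiled in on x86 builds.

```c
#include <stdbool.h>
#include <stdint.h>

/* PAT is reported in bit 16 of EDX for CPUID leaf 1. */
#define CPUID_1_EDX_PAT_BIT 16

/* Hypothetical helper: test the PAT bit in an EDX value from CPUID leaf 1. */
static inline bool cpuid_edx_has_pat(uint32_t edx)
{
	return (edx >> CPUID_1_EDX_PAT_BIT) & 1u;
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>

/* Hypothetical helper: ask the running CPU; false if leaf 1 is unavailable. */
static inline bool cpu_reports_pat(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return false;
	return cpuid_edx_has_pat(edx);
}
#endif
```

This only shows what the CPU itself claims; it says nothing about whether the kernel later keeps or clears the flag.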


older cpus had various issues with PAT, some blatantly obvious, some
more subtle.

And I suppose you have a list of these older CPUs or is this going to be
one of these things where in 5 years time I say to yet another person "ah
yes, I remember someone once telling me that old CPUs apparently had some
issues, some blatantly obvious, some more subtle" and the saga continues
on from there again?

Since for old systems the mtrrs clearly work fine... the idea was to
not take the risk (since there's no reward) and just leave them as
is, in a working state.

With CONFIG_X86_PAT, you now see "CPU and/or kernel does not support PAT."
at the top of your dmesg which is going to make people wonder. I did a
cat /proc/cpuinfo, saw no PAT flag and was just suspicious enough that I
didn't trust it.

A blacklist would be a better idea I feel, but in ANY case I think it's
really bad form to clear the feature flag. They are provided by hardware;
if arch/x86/mm/pat.c won't risk running except on a select few tested
models, whatever, but my /proc/cpuinfo shouldn't be lying to me.
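
A rough sketch of the separation argued for here: the hardware-reported flag is left alone, and the decision to actually use PAT is a separate policy check. Everything in it is illustrative. The struct, enum and function names are invented, and the model list simply mirrors the conditions in the hunk quoted earlier (the Intel condition read as model >= 15); it is not the actual arch/x86 code.

```c
#include <stdbool.h>

enum vendor { VENDOR_AMD, VENDOR_INTEL, VENDOR_OTHER };

/* Hypothetical, simplified stand-in for the per-CPU identification data. */
struct cpu {
	enum vendor vendor;
	unsigned int family;
	unsigned int model;
	bool hw_pat;	/* what CPUID reported; never modified below */
};

/* Policy only: is this model on the list PAT has been tested on? */
static inline bool pat_validated(const struct cpu *c)
{
	switch (c->vendor) {
	case VENDOR_AMD:
		return c->family >= 0xf && c->family <= 0x11;
	case VENDOR_INTEL:
		return c->family == 0xf ||
		       (c->family == 6 && c->model >= 15);
	default:
		return false;
	}
}

/* Use PAT only if the hardware has it and the policy allows it. */
static inline bool pat_usable(const struct cpu *c)
{
	return c->hw_pat && pat_validated(c);
}
```

For an AMD family 6, model 7 part, pat_usable() returns false while hw_pat stays true, so the reported feature flags would still match what the CPU says.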

Rene.