Re: [PATCH v3 07/18] x86/intel_rdt: Add Haswell feature discovery

From: Luck, Tony
Date: Mon Oct 10 2016 - 14:55:50 EST


On Sun, Oct 09, 2016 at 06:28:23PM +0200, Borislav Petkov wrote:
> On Sun, Oct 09, 2016 at 10:09:37AM -0700, Fenghua Yu wrote:
> > The MSR is not guaranteed on every stepping of this family/model
> > because some parts may have the MSR fused off, some bits in the MSR
> > may not be implemented on some parts, and under KVM or in a guest the
> > MSR may not be implemented at all. Those are the reasons why we use
> > wrmsr_safe()/rdmsr_safe() in the Haswell probe.
>
> Please add that info in a comment somewhere there as we'll all forget
> about it otherwise.

How about this? (This diff is on top of the current series, but obviously
we'll fold it into part 07.)


commit cdb05159fb91ed1f85c950c0f2c6de25f143961d
Author: Tony Luck <tony.luck@xxxxxxxxx>
Date: Mon Oct 10 11:48:42 2016 -0700

Update the HSW probe code: better comments, and use IA32_L3_CBM_BASE
as the probe MSR instead of PQR_ASSOC, at the suggestion of the h/w
architect.

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 4903e21d660d..e3c397306f1a 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -56,39 +56,39 @@ struct rdt_resource rdt_resources_all[] = {

/*
* cache_alloc_hsw_probe() - Have to probe for Intel haswell server CPUs
- * as it does not have CPUID enumeration support for Cache allocation.
+ * as they do not have CPUID enumeration support for Cache allocation.
+ * The check for Vendor/Family/Model is not enough to guarantee that
+ * the MSRs won't #GP fault because only the following SKUs support
+ * CAT:
+ * Intel(R) Xeon(R) CPU E5-2658 v3 @ 2.20GHz
+ * Intel(R) Xeon(R) CPU E5-2648L v3 @ 1.80GHz
+ * Intel(R) Xeon(R) CPU E5-2628L v3 @ 2.00GHz
+ * Intel(R) Xeon(R) CPU E5-2618L v3 @ 2.30GHz
+ * Intel(R) Xeon(R) CPU E5-2608L v3 @ 2.00GHz
*
- * Probes by writing to the high 32 bits(CLOSid) of the IA32_PQR_MSR and
- * testing if the bits stick. Max CLOSids is always 4 and max cbm length
+ * Probe by trying to write the first of the L3 cache mask registers
+ * and checking that the bits stick. Max CLOSids is always 4 and max cbm length
* is always 20 on hsw server parts. The minimum cache bitmask length
* allowed for HSW server is always 2 bits. Hardcode all of them.
*/
static inline bool cache_alloc_hsw_probe(void)
{
- u32 l, h_old, h_new, h_tmp;
+ u32 l, h;
struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3];

- if (rdmsr_safe(MSR_IA32_PQR_ASSOC, &l, &h_old))
- return false;
-
- /*
- * Default value is always 0 if feature is present.
- */
- h_tmp = h_old ^ 0x1U;
- if (wrmsr_safe(MSR_IA32_PQR_ASSOC, l, h_tmp))
- return false;
- rdmsr(MSR_IA32_PQR_ASSOC, l, h_new);
-
- if (h_tmp != h_new)
- return false;
-
- wrmsr(MSR_IA32_PQR_ASSOC, l, h_old);
-
r->max_closid = 4;
r->num_closid = r->max_closid;
r->cbm_len = 20;
r->max_cbm = BIT_MASK(20) - 1;
r->min_cbm_bits = 2;
+
+ if (wrmsr_safe(IA32_L3_CBM_BASE, r->max_cbm, 0))
+ return false;
+ rdmsr(IA32_L3_CBM_BASE, l, h);
+
+ if (l != r->max_cbm)
+ return false;
+
r->enabled = true;

return true;