Re: [PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

From: Peter Zijlstra
Date: Wed Apr 23 2014 - 10:45:53 EST


On Wed, Apr 23, 2014 at 04:31:32PM +0200, Stephane Eranian wrote:
> On Thu, Mar 13, 2014 at 8:36 PM, Venkatesh Srinivas
> <venkateshs@xxxxxxxxxx> wrote:
> > CPUs which should support the RAPL counters according to
> > Family/Model/Stepping may still issue #GP when attempting to access
> > the RAPL MSRs. This may happen when Linux is running under KVM and
> > we are passing-through host F/M/S data, for example. Use rdmsrl_safe
> > to first access the RAPL_POWER_UNIT MSR; if this fails, do not
> > attempt to use this PMU.
> >
> > Signed-off-by: Venkatesh Srinivas <venkateshs@xxxxxxxxxx>
> > ---
> > arch/x86/kernel/cpu/perf_event_intel_rapl.c | 12 +++++++++---
> > 1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
> > index 5ad35ad..95700e5 100644
> > --- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
> > +++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
> > @@ -511,6 +511,7 @@ static int rapl_cpu_prepare(int cpu)
> >  	struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
> >  	int phys_id = topology_physical_package_id(cpu);
> >  	u64 ms;
> > +	u64 msr_rapl_power_unit_bits;
> >
> >  	if (pmu)
> >  		return 0;
> > @@ -518,6 +519,9 @@ static int rapl_cpu_prepare(int cpu)
> >  	if (phys_id < 0)
> >  		return -1;
> >
> > +	if (!rdmsrl_safe(MSR_RAPL_POWER_UNIT, &msr_rapl_power_unit_bits))
> > +		return -1;
> > +
> I have a problem with this patch on native hardware. This rdmsrl_safe()
> systematically fails when I know the MSR is perfectly valid on the CPU.
> Consequently, the RAPL PMU was disabled when I tried it on IvyBridge and
> Haswell CPUs.
>
> I don't know the internals of rdmsrl_safe(). Maybe it is invoked too
> early in the boot process.

Weird; the way it works is that it adds an exception table entry for
the rdmsr instruction, so when the rdmsr faults because the MSR is
invalid, the fault handler looks at the exception table and finds the
entry, which instructs it to continue execution at the error path.
rdmsrl_safe() then returns a nonzero error code; a return of 0 means the
read succeeded.
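
To make the return convention concrete, here is a small userspace sketch.
It is a model, not kernel code: model_rdmsr_safe(), model_probe(),
model_probe_negated() and the MSR table are hypothetical names, the register
contents are made up, and -EIO is just the error value picked for the model.
It assumes the usual convention that the safe read returns 0 on success. The
"MSR not present" branch stands in for the exception-table fixup path.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* MSR index of RAPL_POWER_UNIT on Intel CPUs. */
#define MODEL_MSR_RAPL_POWER_UNIT 0x606u

/* Hypothetical stand-in for the CPU: only the listed MSRs "exist". */
struct model_msr {
	uint32_t index;
	uint64_t value;
};

static const struct model_msr model_msrs[] = {
	{ MODEL_MSR_RAPL_POWER_UNIT, 0x000a1003ULL }, /* made-up contents */
};

/*
 * Model of a "safe" MSR read: 0 on success, negative errno on failure.
 * The failure branch plays the role of the fixup path reached through
 * the exception table; the exact errno is an assumption of this model.
 */
static int model_rdmsr_safe(uint32_t msr, uint64_t *val)
{
	size_t i;

	for (i = 0; i < sizeof(model_msrs) / sizeof(model_msrs[0]); i++) {
		if (model_msrs[i].index == msr) {
			*val = model_msrs[i].value;
			return 0;
		}
	}
	*val = 0;
	return -EIO;
}

/* Check negated as in the quoted hunk: bails out when the read succeeds. */
static int model_probe_negated(uint32_t msr)
{
	uint64_t bits;

	if (!model_rdmsr_safe(msr, &bits))
		return -1;
	return 0;
}

/* Check that treats a nonzero return as failure. */
static int model_probe(uint32_t msr)
{
	uint64_t bits;

	if (model_rdmsr_safe(msr, &bits))
		return -1;
	return 0;
}
```

In this model, model_probe_negated() rejects the MSR that is present and
accepts one that is absent, while model_probe() does the opposite. If the
real helper follows the same convention, a negated check like the one in the
hunk above would disable the PMU on hardware where the MSR is valid, which
would match the report.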

