Re: [git-pull -tip] x86: Basic AMD Support for performance counters

From: Ingo Molnar
Date: Sun Mar 01 2009 - 07:00:55 EST



* Ingo Molnar <mingo@xxxxxxx> wrote:

>
> * Jaswinder Singh Rajput <jaswinder@xxxxxxxxxx> wrote:
>
> > On Sun, 2009-03-01 at 12:30 +0100, Ingo Molnar wrote:
> > > * Jaswinder Singh Rajput <jaswinder@xxxxxxxxxx> wrote:
> > >
> > > > On Sun, 2009-03-01 at 09:36 +0100, Ingo Molnar wrote:
> > > > > * Ingo Molnar <mingo@xxxxxxx> wrote:
> > > > >
> > > > > > Seems to be working fine, here's the output from an Athlon 64
> > > > > > 3200+ (Sempron) box:
> > > > > >
> > > > > > Performance counter stats for 'ls':
> > > > > >
> > > > > > 17.420811 task clock ticks (msecs)
> > > > > >
> > > > > > 0 CPU migrations (events)
> > > > > > 12 context switches (events)
> > > > > > 583 pagefaults (events)
> > > > > > 29760299 CPU cycles (events)
> > > > > > 29401642 instructions (events)
> > > > > > 12698498 cache references (events)
> > > > > > 66269 cache misses (events)
> > > > > >
> > > > > > Wall-clock time elapsed: 687.999988 msecs
> > > > >
> > > > > The patches cause a crash on another system - an Opteron system
> > > > > spontaneously reboots at this point during early bootup:
> > > > >
> > > > > CPU 0/0x4 -> Node 0
> > > > > tseg: 00cfe00000
> > > > > CPU: Physical Processor ID: 0
> > > > > CPU: Processor Core ID: 0
> > > > > using C1E aware idle routine
> > > > > AMD Performance Monitoring support detected.
> > > > > ... num counters: 4
> > > > > ... value mask: 0000000000000000
> > > > > ... fixed counters: 0
> > > > > ... counter mask: 000000000000000f
> > > > > ACPI: Core revision 20081204
> > > > > ftrace: converting mcount calls to 0f 1f 44 00 00
> > > > > ftrace: allocating 16365 entries in 129 pages
> > > > > Setting APIC routing to physical flat
> > > > > masked ExtINT on CPU#0
> > > > > ENABLING IO
> > > > > [reboot]
> > > > >
> > > >
> > > > Can you please share your config file?
> > >
> > > any config file will crash that box. I used the 64-bit defconfig
> > > - i.e. 'make ARCH=x86_64 defconfig'.
> > >
> >
> > Can you please try this patch:
> >
> > diff --git a/arch/x86/kernel/cpu/perf_counter.c b/arch/x86/kernel/cpu/perf_counter.c
> > index 266618a..5447cc0 100644
> > --- a/arch/x86/kernel/cpu/perf_counter.c
> > +++ b/arch/x86/kernel/cpu/perf_counter.c
> > @@ -146,7 +146,9 @@ static int __hw_perf_counter_init(struct perf_counter *counter)
> >  	 * Generate PMC IRQs:
> >  	 * (keep 'enabled' bit clear for now)
> >  	 */
> > -	hwc->config = ARCH_PERFMON_EVENTSEL_INT;
> > +	/* Currently Interrupts are disabled on AMD */
> > +	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
> > +		hwc->config = ARCH_PERFMON_EVENTSEL_INT;
>
> still crashes in a similar way.
>
> hm, this box has nmi_watchdog=2, and the NMI watchdog uses the
> PMU too - will disable that.

yep, nmi_watchdog=0 solves the regression. You ought to be able
to reproduce the same problem by adding nmi_watchdog=2 on your
testbox.
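
To illustrate why two PMU users can collide, here is a minimal user-space
sketch of exclusive counter ownership. It is not the kernel's code: the
names pmu_reserve(), pmu_release() and enum pmu_owner are hypothetical,
and the only detail taken from this thread is the counter count of 4 from
the boot log. It assumes that the NMI watchdog and the perf counter code
each want sole control of a hardware counter.

```c
#include <stdbool.h>

#define NUM_COUNTERS 4	/* "num counters: 4" in the boot log quoted above */

/* Hypothetical owners of a hardware counter in this sketch. */
enum pmu_owner {
	PMU_FREE = 0,
	PMU_NMI_WATCHDOG,
	PMU_PERF_COUNTERS,
};

/* One slot per counter; static storage starts out as PMU_FREE. */
static enum pmu_owner owner[NUM_COUNTERS];

/*
 * Claim counter 'idx' for 'who'. Fails if the index is out of range
 * or if a different owner already holds the counter.
 */
static bool pmu_reserve(unsigned int idx, enum pmu_owner who)
{
	if (idx >= NUM_COUNTERS || who == PMU_FREE)
		return false;
	if (owner[idx] != PMU_FREE && owner[idx] != who)
		return false;
	owner[idx] = who;
	return true;
}

/* Give the counter back, but only if 'who' is the current owner. */
static void pmu_release(unsigned int idx, enum pmu_owner who)
{
	if (idx < NUM_COUNTERS && owner[idx] == who)
		owner[idx] = PMU_FREE;
}
```

In this model, once the watchdog has taken counter 0, a later
pmu_reserve(0, PMU_PERF_COUNTERS) fails instead of silently reprogramming
a counter that someone else is using. Whether that is the right policy
for the real code is a separate question.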

Ingo