Re: 2.6.34-rc2 - crash on shutdown

From: Clemens Ladisch
Date: Tue Mar 23 2010 - 09:52:46 EST


Stephane Eranian wrote:
> On Tue, Mar 23, 2010 at 1:02 PM, Clemens Ladisch <clemens@xxxxxxxxxx> wrote:
> > The only pointer access in this function is cpuhw->amd_nb, but
> > I don't see any obvious bugs.
>
> I reported a problem with the AMD initialization just last week.
> There is an issue with amd_pmu_cpu_online() which gets called
> too early, and thus fails. That leaves some bogus state and causes
> a crash in amd_pmu_cpu_offline().
>
> I proposed a fix which was rejected. The alternative involves moving
> some of the CPU initialization code (on AMD) to an earlier position,
> i.e., one that would be executed before the CPU_STARTED notifier. Nobody
> has proposed anything else so far.

I don't know about the early bootmem stuff, but regardless of this issue,
if amd_pmu_cpu_online() can fail, then amd_pmu_cpu_offline() must be able
to handle this without blowing up. Something like this (untested):

Signed-off-by: Clemens Ladisch <clemens@xxxxxxxxxx>

--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -324,17 +324,17 @@ static void amd_pmu_cpu_online(int cpu)
 	if (boot_cpu_data.x86_max_cores < 2)
 		return;

+	cpu1 = &per_cpu(cpu_hw_events, cpu);
+	cpu1->amd_nb = NULL;
+
 	/*
 	 * function may be called too early in the
 	 * boot process, in which case nb_id is bogus
 	 */
 	nb_id = amd_get_nb_id(cpu);
 	if (nb_id == BAD_APICID)
 		return;

-	cpu1 = &per_cpu(cpu_hw_events, cpu);
-	cpu1->amd_nb = NULL;
-
 	raw_spin_lock(&amd_nb_lock);

 	for_each_online_cpu(i) {
@@ -370,7 +370,7 @@ static void amd_pmu_cpu_offline(int cpu)

 	raw_spin_lock(&amd_nb_lock);

-	if (--cpuhw->amd_nb->refcnt == 0)
+	if (cpuhw->amd_nb && --cpuhw->amd_nb->refcnt == 0)
 		kfree(cpuhw->amd_nb);

 	cpuhw->amd_nb = NULL;
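
For illustration only, the same pattern can be sketched in plain userspace C. This is not the kernel code: the names nb_sketch, cpu_sketch, cpu_online_sketch() and cpu_offline_sketch() are hypothetical, the id_valid flag stands in for the BAD_APICID check, and the sharing of one amd_nb between several CPUs (and the locking) is left out. The sketch only shows the two points of the patch: the pointer is reset before any early return, and the offline path tolerates a NULL pointer.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for struct amd_nb and struct cpu_hw_events. */
struct nb_sketch {
	int refcnt;
};

struct cpu_sketch {
	struct nb_sketch *nb;
};

/*
 * Online path: reset the pointer first, so that every early return
 * leaves the per-CPU state well defined.
 * Returns 0 on success, -1 if initialization could not complete.
 */
static int cpu_online_sketch(struct cpu_sketch *st, int id_valid)
{
	st->nb = NULL;

	/* stands in for the "called too early, nb_id is bogus" case */
	if (!id_valid)
		return -1;

	st->nb = calloc(1, sizeof(*st->nb));
	if (!st->nb)
		return -1;

	st->nb->refcnt = 1;
	return 0;
}

/*
 * Offline path: must cope with an online call that failed.
 * Returns 1 if the structure was freed, 0 otherwise.
 */
static int cpu_offline_sketch(struct cpu_sketch *st)
{
	int freed = 0;

	if (st->nb && --st->nb->refcnt == 0) {
		free(st->nb);
		freed = 1;
	}

	st->nb = NULL;
	return freed;
}
```

Calling cpu_online_sketch() with id_valid == 0 and then cpu_offline_sketch() on the same state is the failure sequence described above; with the reset and the NULL check in place, the offline call does nothing apart from clearing the pointer.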