Re: [GIT PULL] perf fixes

From: Linus Torvalds
Date: Thu Mar 14 2013 - 21:06:45 EST


On Thu, Mar 14, 2013 at 5:24 PM, Stephane Eranian <eranian@xxxxxxxxxx> wrote:
>
> I bet if you force the affinity of your perf record to be on
> a CPU other than CPU0, you will not get the crash.
>
> This is what I am seeing now. It appears that on resume,
> CPU0 hotplug callbacks for perf_events are not invoked,
> leaving the DS_AREA MSR at 0.
>
> Can you confirm on your machine?

I'm not even going to bother confirming it, because I think you're
right, and I think the reason is clear: the DS initialization code
uses the CPU_UP notifiers.

And that's sufficient for CPU hotplug, which is what suspend/resume
ends up doing for all but the boot CPU. But the boot CPU is not
hotplugged.

Using CPU_UP notifiers is wrong, and they get called too late anyway.

The code should use a real resume method. Or, better yet, just do it
right, and do it from __restore_processor_state().
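
To make the failure mode concrete, here is a toy user-space model, not
kernel code and not a patch. Every name in it (ds_area, simulate_suspend,
cpu_up_notifier, boot_cpu_resume, and so on) is made up for illustration,
and it assumes that suspend clears the DS_AREA state on every CPU. It only
shows why a notifier that runs for hotplugged CPUs never reaches the boot
CPU, and that a separate resume method for the boot CPU closes the gap.

```c
/* Toy user-space model of the resume path; not kernel code. */
#define NR_CPUS   4
#define BOOT_CPU  0
/* Hypothetical non-zero value standing in for a real DS buffer address. */
#define DS_BASE   0x1000UL

/* Models the per-CPU DS_AREA MSR; 0 means "not set up". */
static unsigned long ds_area[NR_CPUS];

/* Assumption of this model: suspend loses the MSR contents on every CPU. */
static void simulate_suspend(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		ds_area[cpu] = 0;
}

/* Stand-in for the CPU_UP notifier: only called for hotplugged CPUs. */
static void cpu_up_notifier(int cpu)
{
	ds_area[cpu] = DS_BASE + (unsigned long)cpu;
}

/* Stand-in for a real resume method that runs on the boot CPU. */
static void boot_cpu_resume(void)
{
	ds_area[BOOT_CPU] = DS_BASE + BOOT_CPU;
}

/*
 * Resume: every CPU except the boot CPU comes back through hotplug,
 * so only those CPUs see the notifier. The boot CPU is restored only
 * if a resume method exists for it.
 */
static void simulate_resume(int have_resume_method)
{
	if (have_resume_method)
		boot_cpu_resume();

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu != BOOT_CPU)
			cpu_up_notifier(cpu);
}

/* Counts CPUs whose modeled DS_AREA is still 0 after resume. */
static int cpus_without_ds(void)
{
	int missing = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (ds_area[cpu] == 0)
			missing++;
	return missing;
}
```

Calling simulate_suspend() followed by simulate_resume(0) leaves exactly
one CPU, the boot CPU, with a zero ds_area entry; with simulate_resume(1)
none are left at zero.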

Those f*cking CPU notifiers are a pain in the ass, and they tend to be
invariably broken, and they have their own idiotic hacks that are
equally broken (ie that x86_pmu_notifier() thing seems to make up its
own suspend/resume with
"x86_pmu.cpu_prepare/cpu_starting/cpu_dying/cpu_dead" things).

I guess we could make the BP do a fake cpu notifier thing around the
suspend of the boot processor as well, but most of the per-CPU stuff
seems to be perfectly fine without it (ie mtrr, apic, etc etc all use
the suspend/resume infrastructure) and doesn't need that kind of
stuff.

Linus