Re: [RFC][PATCH 2/8] perf, arch: Use early_initcall() for all archpmu implementations

From: Peter Zijlstra
Date: Thu Nov 25 2010 - 05:25:29 EST


On Wed, 2010-11-17 at 23:17 +0100, Peter Zijlstra wrote:
> plain text document attachment (perf-fix-hw-init.patch)
> Currently architectures use various random locations to init the PMU
> driver, for some this happens before the perf core code is
> initialized.
>
> In order to avoid calling perf_pmu_register() before the core code is
> up and running and able to deal with it, move all arch init to at
> least early_initcall (some archs use a later init, which is fine).
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> ---

<snip alpha,sparc bits>

> Index: linux-2.6/arch/x86/kernel/cpu/perf_event.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event.c
> +++ linux-2.6/arch/x86/kernel/cpu/perf_event.c
> @@ -1348,7 +1348,7 @@ static void __init pmu_check_apic(void)
>  	pr_info("no hardware sampling interrupt available.\n");
>  }
>
> -void __init init_hw_perf_events(void)
> +int __init init_hw_perf_events(void)
>  {
>  	struct event_constraint *c;
>  	int err;
> @@ -1363,11 +1363,11 @@ void __init init_hw_perf_events(void)
>  		err = amd_pmu_init();
>  		break;
>  	default:
> -		return;
> +		return 0;
>  	}
>  	if (err != 0) {
>  		pr_cont("no PMU driver, software events only.\n");
> -		return;
> +		return 0;
>  	}
>
>  	pmu_check_apic();
> @@ -1420,7 +1420,10 @@ void __init init_hw_perf_events(void)
>
>  	perf_pmu_register(&pmu);
>  	perf_cpu_notifier(x86_pmu_notifier);
> +
> +	return 0;
>  }
> +early_initcall(init_hw_perf_events);
>
>  static inline void x86_pmu_read(struct perf_event *event)
>  {

Right, so hw perf init happens from (after this patch):

arch_initcall: powerpc, arm, sh, mips
early_initcall: x86, sparc, alpha


Now the problem is that the generic watchdog code (kernel/watchdog.c)
tries to create hw perf events, and that too runs from early_initcall.

So my question is: how do we go about curing this? powerpc, arm, sh and
mips initialize too late, and the rest depend on link order to work,
which is not really a nice situation.
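
To make the ordering problem concrete, here is a small userspace model
(not kernel code). All names in it (model_pmu_init, model_watchdog_init,
watchdog_gets_hw_event, the two-level enum) are made up for
illustration, and position in the table stands in for link order within
one initcall level:

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; only the two levels discussed here are modelled. */
enum level { LEVEL_EARLY, LEVEL_ARCH, LEVEL_COUNT };

typedef int (*initcall_fn)(void);

struct initcall {
	enum level level;
	initcall_fn fn;
};

static bool pmu_registered;		/* set once the modelled PMU init ran */
static bool watchdog_has_hw_event;	/* did the watchdog find a PMU? */

static int model_pmu_init(void)
{
	pmu_registered = true;
	return 0;
}

static int model_watchdog_init(void)
{
	/* A hardware event can only be created if a PMU is already there. */
	watchdog_has_hw_event = pmu_registered;
	return 0;
}

/* Run level by level; within a level, table order models link order. */
static void run_initcalls(const struct initcall *calls, size_t n)
{
	for (int lvl = 0; lvl < LEVEL_COUNT; lvl++)
		for (size_t i = 0; i < n; i++)
			if (calls[i].level == (enum level)lvl)
				calls[i].fn();
}

/* The watchdog always runs at the early level; the PMU level and order vary. */
static bool watchdog_gets_hw_event(enum level pmu_level, bool pmu_linked_first)
{
	struct initcall pmu = { pmu_level, model_pmu_init };
	struct initcall wd = { LEVEL_EARLY, model_watchdog_init };
	struct initcall calls[2];

	calls[0] = pmu_linked_first ? pmu : wd;
	calls[1] = pmu_linked_first ? wd : pmu;

	pmu_registered = false;
	watchdog_has_hw_event = false;
	run_initcalls(calls, 2);
	return watchdog_has_hw_event;
}
```

In this model watchdog_gets_hw_event(LEVEL_ARCH, ...) is false whatever
the order (the powerpc, arm, sh, mips case), and with LEVEL_EARLY the
result follows pmu_linked_first (the x86, sparc, alpha case).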

There are two categories of solutions:
- move the watchdog later, and
- move the hw perf init earlier.

The former is undesirable because we want the watchdog as early as
possible; the latter needs new infrastructure (also, I don't know if the
arch implementations can actually run this early).

So do I create a perf_initcall() or is there another solution that
avoids things like calling the watchdog code from all arch init code?
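
As a rough illustration of what a separate perf init level could buy
us, here is a userspace sketch (not kernel code). It assumes such a
level would run ahead of the early level, which is only one possible
reading; every sketch_* name is hypothetical, and it says nothing about
whether the arch PMU code can actually run that early:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ordering: a dedicated perf level ahead of the early level. */
enum sketch_level { SKETCH_PERF, SKETCH_EARLY, SKETCH_ARCH, SKETCH_LEVELS };

typedef int (*sketch_initcall_fn)(void);

struct sketch_initcall {
	enum sketch_level level;
	sketch_initcall_fn fn;
};

static bool sketch_pmu_ready;
static bool sketch_watchdog_ok;

static int sketch_pmu_init(void)
{
	sketch_pmu_ready = true;
	return 0;
}

static int sketch_watchdog_init(void)
{
	/* The watchdog gets a hardware event only if the PMU came up first. */
	sketch_watchdog_ok = sketch_pmu_ready;
	return 0;
}

/* Watchdog listed first on purpose: table (link) order must not matter. */
static const struct sketch_initcall sketch_calls[] = {
	{ SKETCH_EARLY, sketch_watchdog_init },
	{ SKETCH_PERF,  sketch_pmu_init },
};

/* Run every call level by level and report whether the watchdog succeeded. */
static bool sketch_boot(void)
{
	size_t n = sizeof(sketch_calls) / sizeof(sketch_calls[0]);

	sketch_pmu_ready = false;
	sketch_watchdog_ok = false;
	for (int lvl = 0; lvl < SKETCH_LEVELS; lvl++)
		for (size_t i = 0; i < n; i++)
			if (sketch_calls[i].level == (enum sketch_level)lvl)
				sketch_calls[i].fn();
	return sketch_watchdog_ok;
}
```

With the PMU init on its own earlier level, sketch_boot() succeeds even
though the watchdog entry comes first in the table, so the result no
longer hinges on link order.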