Re: [RFC, PATCH] Add a CPU_STARTING notifier (was: Re: [PATCH,RFC] v4 scalable classic RCU implementation)

From: Paul E. McKenney
Date: Sun Sep 07 2008 - 15:47:34 EST


On Sun, Sep 07, 2008 at 12:18:18PM +0200, Manfred Spraul wrote:
> Paul E. McKenney wrote:
>> +/*
>> + * If the specified CPU is offline, tell the caller that it is in
>> + * a quiescent state. Otherwise, whack it with a reschedule IPI.
>> + * Grace periods can end up waiting on an offline CPU when that
>> + * CPU is in the process of coming online -- it will be added to the
>> + * rcu_node bitmasks before it actually makes it online. Because this
>> + * race is quite rare, we check for it after detecting that the grace
>> + * period has been delayed rather than checking each and every CPU
>> + * each and every time we start a new grace period.
>> + */
>> +static int rcu_implicit_offline_qs(struct rcu_data *rdp)
>> +{
>> + /*
>> + * If the CPU is offline, it is in a quiescent state. We can
>> + * trust its state not to change because interrupts are disabled.
>> + */
>> + if (cpu_is_offline(rdp->cpu)) {
>> + rdp->offline_fqs++;
>> + return 1;
>> + }
>>
> I fear that this won't work.
> E.g. look at x86, smp_callin() [arch/x86/kernel/smpboot.c]:
> The boot code must enable local interrupts around calibrate_delay(),
> otherwise the NMI watchdog would complain.
> That means the first interrupts, the first read side critical sections can
> run way before the cpu bit is set within cpu_online_map.
> cpus are just started, we are not within stop_machine. Thus we cannot make
> any assumption about what the remaining cpus are doing.

Ouch! Very good catch!!! ;-)

> What about introducing a CPU_STARTING notifier call, similar to CPU_DYING:
> - called with disabled interrupts
> - called before interrupts are enabled
> - must not sleep
> - called on the new cpu.

I suppose another approach would be to make calibrate_delay() kick the
watchdog timer every so often, but the effort explaining to people why
their machine's bogomips had decreased would probably far exceed any
possible benefit...

So given a CPU_STARTING notifier, RCU could set a bit in a coming-online
bitmap, which would be reset rcu_online_cpu(). Then RCU would consider
a CPU offline only if it is marked offline in cpu_online_map -and- it is
not marked as coming-online.

> This might also be useful for something like kvm. I'm not sure if it's
> guaranteed that hardware_enable() runs early enough.
>
> Attached is a patch proposal. Note that it doesn't work correctly: On
> x86-64, I have seen that CPU_STARTING is called after CPU_ONLINE. Thus
> frozen_cpus could already be cleared, then _FROZEN would be wrong.

Then notify_cpu_starting() is invoked right before smp_callin() in
start_secondary()?

Thanx, Paul

> --
> Manfred

> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> index d7faf88..c2747ac 100644
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -69,6 +69,7 @@ static inline void unregister_cpu_notifier(struct notifier_block *nb)
> #endif
>
> int cpu_up(unsigned int cpu);
> +void notify_cpu_starting(unsigned int cpu);
> extern void cpu_hotplug_init(void);
> extern void cpu_maps_update_begin(void);
> extern void cpu_maps_update_done(void);
> diff --git a/include/linux/notifier.h b/include/linux/notifier.h
> index da2698b..8e47661 100644
> --- a/include/linux/notifier.h
> +++ b/include/linux/notifier.h
> @@ -213,9 +213,16 @@ static inline int notifier_to_errno(int ret)
> #define CPU_DOWN_FAILED 0x0006 /* CPU (unsigned)v NOT going down */
> #define CPU_DEAD 0x0007 /* CPU (unsigned)v dead */
> #define CPU_DYING 0x0008 /* CPU (unsigned)v not running any task,
> - * not handling interrupts, soon dead */
> + * not handling interrupts, soon dead.
> + * Called on the dying cpu, interrupts
> + * are already disabled. Must not
> + * sleep, must not fail */
> #define CPU_POST_DEAD 0x0009 /* CPU (unsigned)v dead, cpu_hotplug
> * lock is dropped */
> +#define CPU_STARTING 0x000A /* CPU (unsigned)v soon running.
> + * Called on the new cpu, just before
> + * enabling interrupts. Must not sleep,
> + * must not fail */
>
> /* Used for CPU hotplug events occuring while tasks are frozen due to a suspend
> * operation in progress
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 5b7c88f..2300fc0 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -455,6 +455,25 @@ out:
> }
> #endif /* CONFIG_PM_SLEEP_SMP */
>
> +/**
> + * notify_cpu_starting(cpu) - call the CPU_STARTING notifiers
> + * @cpu: cpu that just started
> + *
> + * This function calls the cpu_chain notifiers with CPU_STARTING.
> + * It must be called by the arch code on the new cpu, immediately
> + * before enabling interrupts.
> + */
> +void notify_cpu_starting(unsigned int cpu)
> +{
> + unsigned long val = CPU_STARTING;
> +
> +#ifdef CONFIG_PM_SLEEP_SMP
> + if (cpu_isset(cpu, frozen_cpus))
> + val = CPU_STARTING_FROZEN;
> +#endif /* CONFIG_PM_SLEEP_SMP */
> + raw_notifier_call_chain(&cpu_chain, val, (void*)(long)cpu);
> +}
> +
> #endif /* CONFIG_SMP */
>
> /*

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/