Re: [RFC PATCH v3 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context

From: Oleg Nesterov
Date: Sun Dec 09 2012 - 14:14:45 EST


On 12/07, Srivatsa S. Bhat wrote:
>
> Per-cpu counters can help solve the cache-line bouncing problem. So we
> actually use the best of both: per-cpu counters (no-waiting) at the reader
> side in the fast-path, and global rwlocks in the slowpath.
>
> [ Fastpath = no writer is active; Slowpath = a writer is active ]
>
> IOW, the hotplug readers just increment/decrement their per-cpu refcounts
> when no writer is active.

Plus LOCK and cli/sti. I do not pretend I really know how bad this is
performance-wise, though. And at first glance this looks overcomplicated.

But yes, it is easy to blame somebody else's code ;) And I can't suggest
anything better, at least right now. If I understand correctly, we cannot
use, say, synchronize_sched() in the _cpu_down() path, because you also want
to improve the latency. And I guess something like kick_all_cpus_sync() is
"too heavy".

Also, after a quick reading this doesn't look correct; please see below.

> +void get_online_cpus_atomic(void)
> +{
> + unsigned int cpu = smp_processor_id();
> + unsigned long flags;
> +
> + preempt_disable();
> + local_irq_save(flags);
> +
> + if (cpu_hotplug.active_writer == current)
> + goto out;
> +
> + smp_rmb(); /* Paired with smp_wmb() in drop_writer_signal() */
> +
> + if (likely(!writer_active(cpu))) {

WINDOW. Suppose that reader_active() == F.

> + mark_reader_fastpath();
> + goto out;

Why can't take_cpu_down() do announce_cpu_offline_begin() + sync_all_readers()
in between?

Looks like we should increment the counter first, then check writer_active().
And sync_atomic_reader() needs an rmb between the two atomic_read()s.
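
To make the "increment first, then check" ordering concrete, here is a
minimal userspace sketch using C11 atomics. Everything in it is a
hypothetical stand-in (reader_refcnt, writer_signal and the function names
are not the helpers from the patch), and it models a single CPU's counter
with sequentially consistent operations instead of per-cpu variables and
explicit barriers. The point is only that the reader publishes its count
before it looks for a writer, so a writer that announces itself afterwards
cannot miss the reader.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for one CPU's state; not the patch's variables. */
static atomic_int reader_refcnt;    /* number of active readers */
static atomic_bool writer_signal;   /* true while a writer is active */

/* Reader: publish the refcount first, only then check for a writer. */
static bool reader_try_fastpath(void)
{
    atomic_fetch_add(&reader_refcnt, 1);   /* seq_cst, acts as a full barrier */
    if (!atomic_load(&writer_signal))
        return true;                       /* a later writer will see our count */
    atomic_fetch_sub(&reader_refcnt, 1);   /* back out; caller takes the slow path */
    return false;
}

/* Reader: leave a fast-path critical section. */
static void reader_exit(void)
{
    int old = atomic_fetch_sub(&reader_refcnt, 1);
    assert(old > 0);                       /* must pair with a successful enter */
}

/* Writer: announce first, then check whether any reader is still active. */
static bool writer_begin_and_check_idle(void)
{
    atomic_store(&writer_signal, true);
    return atomic_load(&reader_refcnt) == 0;
}

/* Writer: hotplug operation finished. */
static void writer_end(void)
{
    atomic_store(&writer_signal, false);
}
```

With this ordering, either the reader sees the writer's signal and backs
out, or the writer sees the reader's count and has to wait; both sides
missing each other is not possible as long as the increment and the check
are not reordered.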


Or, again, suppose that reader_active() == F, but is_new_writer() == T.

> + if (is_new_writer(cpu)) {
> + /*
> + * ACK the writer's signal only if this is a fresh read-side
> + * critical section, and not just an extension of a running
> + * (nested) read-side critical section.
> + */
> + if (!reader_active(cpu)) {
> + ack_writer_signal();

What if take_cpu_down() does announce_cpu_offline_end() right before
ack_writer_signal()? In this case get_online_cpus_atomic() returns
with writer_signal == -1. If nothing else, this breaks the next
raise_writer_signal().
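
The interleaving can be replayed step by step in a small userspace sketch.
This is only a model under assumptions the quoted hunk does not confirm: it
assumes that raising the signal stores 1, that the ACK is an unconditional
decrement, and that dropping the signal stores 0. The helpers in the patch
may well differ; the names below are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model: 1 = new writer, 0 = no writer or already ACKed. */
static atomic_int writer_signal;

static void raise_signal(void) { atomic_store(&writer_signal, 1); }
static void drop_signal(void)  { atomic_store(&writer_signal, 0); }

/* Unconditional ACK (an assumption made for this sketch). */
static void ack_signal(void)   { atomic_fetch_sub(&writer_signal, 1); }

/* Replays the race on a single thread and returns the final signal value. */
static int replay_late_ack(void)
{
    raise_signal();                          /* writer: offline begins */
    int seen = atomic_load(&writer_signal);  /* reader: is_new_writer() == T */
    assert(seen == 1);
    drop_signal();                           /* writer: offline ends first */
    ack_signal();                            /* reader: ACK arrives too late */
    return atomic_load(&writer_signal);
}
```

In this model replay_late_ack() leaves the signal at -1, which is the stale
state described above.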

Oleg.
