Re: [RFC PATCH v3 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Oleg Nesterov
Date: Mon Dec 10 2012 - 12:24:05 EST
On 12/10, Srivatsa S. Bhat wrote:
>
> On 12/10/2012 01:52 AM, Oleg Nesterov wrote:
> > On 12/10, Srivatsa S. Bhat wrote:
> >>
> >> On 12/10/2012 12:44 AM, Oleg Nesterov wrote:
> >>
> >>> But yes, it is easy to blame somebody else's code ;) And I can't suggest
> >>> something better at least right now. If I understand correctly, we can not
> >>> use, say, synchronize_sched() in _cpu_down() path
> >>
> >> We can't sleep in that code.. so that's a no-go.
> >
> > But we can?
> >
> > Note that I meant _cpu_down(), not get_online_cpus_atomic() or take_cpu_down().
> >
>
> Maybe I'm missing something, but how would it help if we did a
> synchronize_sched() so early (in _cpu_down())? Another bunch of preempt_disable()
> sections could start immediately after our call to synchronize_sched() no?
> How would we deal with that?

Sorry for the confusion. Of course synchronize_sched() alone is not enough.
But we can use it to synchronize with preempt-disabled sections and avoid
the barriers/atomics in the fast path.

For example,

	bool writer_pending;
	DEFINE_RWLOCK(writer_rwlock);
	DEFINE_PER_CPU(int, reader_ctr);

	void get_online_cpus_atomic(void)
	{
		preempt_disable();

		if (likely(!writer_pending) || __this_cpu_read(reader_ctr)) {
			__this_cpu_inc(reader_ctr);
			return;
		}

		read_lock(&writer_rwlock);
		__this_cpu_inc(reader_ctr);
		read_unlock(&writer_rwlock);
	}

	// lacks release semantics, but we don't care
	void put_online_cpus_atomic(void)
	{
		__this_cpu_dec(reader_ctr);
		preempt_enable();
	}

Now, _cpu_down() does

	writer_pending = true;
	synchronize_sched();

before stop_one_cpu(). When synchronize_sched() returns, we know that
every get_online_cpus_atomic() must see writer_pending == true. And, if
any CPU incremented its reader_ctr, we must see that it is not zero.

take_cpu_down() does

	write_lock(&writer_rwlock);

	for_each_online_cpu(cpu) {
		while (per_cpu(reader_ctr, cpu))
			cpu_relax();
	}

and takes the lock.

However. This can lead to the deadlock we already discussed. So
take_cpu_down() should do

	retry:
		write_lock(&writer_rwlock);

		for_each_online_cpu(cpu) {
			if (per_cpu(reader_ctr, cpu)) {
				write_unlock(&writer_rwlock);
				goto retry;
			}
		}

to take the lock. But this is livelockable. However, I do not think it
is possible to avoid the livelock.

Just in case: the code above is only for illustration. Perhaps it is not
100% correct, and perhaps we can do it better. cpu_hotplug.active_writer
is ignored for simplicity; get/put should check current == active_writer.
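
To make the control flow easier to follow, here is a rough single-threaded
userspace model of the scheme. It is only a sketch under stated
assumptions: NR_CPUS, reader_enter(), reader_exit(), writer_announce() and
writer_try_lock() are hypothetical names, plain arrays and flags stand in
for the per-CPU counter and the rwlock, and nothing here models
synchronize_sched(), memory ordering, or real concurrency. It only shows
which path a reader takes and when one pass of the writer's retry loop
succeeds.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4			/* hypothetical, for the model only */

static bool writer_pending;
static bool writer_locked;		/* stands in for holding writer_rwlock for write */
static int reader_ctr[NR_CPUS];		/* stands in for DEFINE_PER_CPU(int, reader_ctr) */

/*
 * Models get_online_cpus_atomic() on a given CPU. Returns true if the
 * reader gets in, false where the real code would spin in read_lock().
 */
static bool reader_enter(int cpu)
{
	/* Fast path: no writer announced, or this CPU is already a reader. */
	if (!writer_pending || reader_ctr[cpu] > 0) {
		reader_ctr[cpu]++;
		return true;
	}

	/* Slow path: a new reader must wait while the writer holds the lock. */
	if (writer_locked)
		return false;

	reader_ctr[cpu]++;
	return true;
}

/* Models put_online_cpus_atomic(). */
static void reader_exit(int cpu)
{
	assert(reader_ctr[cpu] > 0);
	reader_ctr[cpu]--;
}

/* Models the _cpu_down() step; synchronize_sched() would follow here. */
static void writer_announce(void)
{
	writer_pending = true;
}

/*
 * Models one pass of the retry loop in take_cpu_down(): take the lock,
 * scan the counters, and back off if any CPU still has a reader.
 */
static bool writer_try_lock(void)
{
	int cpu;

	writer_locked = true;
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (reader_ctr[cpu]) {
			writer_locked = false;	/* write_unlock(), caller retries */
			return false;
		}
	}
	return true;
}
```

In this model a reader that is already inside a section can always nest,
even after the writer has announced itself, which is why the writer has
to back off and retry instead of spinning on the counters with the lock
held.
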
Oleg.