Re: [PATCH v6 04/46] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks

From: Lai Jiangshan
Date: Mon Feb 25 2013 - 19:17:37 EST


On Tue, Feb 26, 2013 at 3:26 AM, Srivatsa S. Bhat
<srivatsa.bhat@xxxxxxxxxxxxxxxxxx> wrote:
> Hi Lai,
>
> On 02/25/2013 09:23 PM, Lai Jiangshan wrote:
>> Hi, Srivatsa,
>>
>> The goal of the whole patchset looks good to me.
>
> Cool! Thanks :-)
>
>> A question: how did you find all such usages of
>> "preempt_disable()" and convert them? Have all of them been converted?
>>
>
> Well, I scanned through the source tree for usages which implicitly
> disabled CPU offline and converted them over. It's not limited to uses
> of preempt_disable() alone - even spin_locks, rwlocks, local_irq_disable()
> etc also help disable CPU offline. So I tried to dig out all such uses
> and converted them. However, since the merge window is open, a lot of
> new code is flowing into the tree. So I'll have to rescan the tree to
> see if there are any more places to convert.
>
>> And I think the lock is too complex and reinvents the wheel. Why don't
>> you reuse the lglock?
>
> lglocks? No way! ;-) See below...
>
>> I wrote an untested draft here.
>>
>> Thanks,
>> Lai
>>
>> PS: Some HA tools (I'm writing one) take checkpoints of
>> virtual machines frequently; I guess this patchset can speed up such
>> tools.
>>
>> From 01db542693a1b7fc6f9ece45d57cb529d9be5b66 Mon Sep 17 00:00:00 2001
>> From: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
>> Date: Mon, 25 Feb 2013 23:14:27 +0800
>> Subject: [PATCH] lglock: add read-preference local-global rwlock
>>
>> locality via lglock(trylock)
>> read-preference read-write-lock via fallback rwlock_t
>>
>> Signed-off-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
>> ---
>> include/linux/lglock.h | 31 +++++++++++++++++++++++++++++++
>> kernel/lglock.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 76 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/lglock.h b/include/linux/lglock.h
>> index 0d24e93..30fe887 100644
>> --- a/include/linux/lglock.h
>> +++ b/include/linux/lglock.h
>> @@ -67,4 +67,35 @@ void lg_local_unlock_cpu(struct lglock *lg, int cpu);
>> void lg_global_lock(struct lglock *lg);
>> void lg_global_unlock(struct lglock *lg);
>>
>> +struct lgrwlock {
>> +	unsigned long __percpu *fallback_reader_refcnt;
>> +	struct lglock lglock;
>> +	rwlock_t fallback_rwlock;
>> +};
>> +
>> +#define DEFINE_LGRWLOCK(name) \
>> +	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock) \
>> +	= __ARCH_SPIN_LOCK_UNLOCKED; \
>> +	static DEFINE_PER_CPU(unsigned long, name ## _refcnt); \
>> +	struct lgrwlock name = { \
>> +		.fallback_reader_refcnt = &name ## _refcnt, \
>> +		.lglock = { .lock = &name ## _lock } }
>> +
>> +#define DEFINE_STATIC_LGRWLOCK(name) \
>> +	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock) \
>> +	= __ARCH_SPIN_LOCK_UNLOCKED; \
>> +	static DEFINE_PER_CPU(unsigned long, name ## _refcnt); \
>> +	static struct lgrwlock name = { \
>> +		.fallback_reader_refcnt = &name ## _refcnt, \
>> +		.lglock = { .lock = &name ## _lock } }
>> +
>> +static inline void lg_rwlock_init(struct lgrwlock *lgrw, char *name)
>> +{
>> +	lg_lock_init(&lgrw->lglock, name);
>> +}
>> +
>> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw);
>> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw);
>> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw);
>> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw);
>> #endif
>> diff --git a/kernel/lglock.c b/kernel/lglock.c
>> index 6535a66..463543a 100644
>> --- a/kernel/lglock.c
>> +++ b/kernel/lglock.c
>> @@ -87,3 +87,48 @@ void lg_global_unlock(struct lglock *lg)
>> preempt_enable();
>> }
>> EXPORT_SYMBOL(lg_global_unlock);
>> +
>> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw)
>> +{
>> +	struct lglock *lg = &lgrw->lglock;
>> +
>> +	preempt_disable();
>> +	if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
>> +		if (likely(arch_spin_trylock(this_cpu_ptr(lg->lock)))) {
>> +			rwlock_acquire_read(&lg->lock_dep_map, 0, 0, _RET_IP_);
>> +			return;
>> +		}
>> +		read_lock(&lgrw->fallback_rwlock);
>> +	}
>> +
>> +	__this_cpu_inc(*lgrw->fallback_reader_refcnt);
>> +}
>> +EXPORT_SYMBOL(lg_rwlock_local_read_lock);
>> +
>> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw)
>> +{
>> +	if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
>> +		lg_local_unlock(&lgrw->lglock);
>> +		return;
>> +	}
>> +
>> +	if (!__this_cpu_dec_return(*lgrw->fallback_reader_refcnt))
>> +		read_unlock(&lgrw->fallback_rwlock);
>> +
>> +	preempt_enable();
>> +}
>> +EXPORT_SYMBOL(lg_rwlock_local_read_unlock);
>> +
>
> If I read the code above correctly, all you are doing is implementing a
> recursive reader-side primitive (i.e., allowing the reader to call these
> functions recursively, without resulting in a self-deadlock).
>
> But the thing is, making the reader-side recursive is the least of our
> problems! Our main challenge is to make the locking extremely flexible
> and also safe-guard it against circular-locking-dependencies and deadlocks.
> Please take a look at the changelog of patch 1 - it explains the situation
> with an example.


My lock meets your requirements (I read patches 1-6 before sending this).
On the read side, the lglock's per-CPU lock is taken via trylock, so the
lglock cannot contribute to deadlocks; we can treat it as nonexistent
when looking for deadlocks. And the global fallback rwlock does not lead
to deadlocks because it is read-preference (you need to increment
fallback_reader_refcnt inside the cpu-hotplug write side; I don't do
that in the generic lgrwlock).


If lg_rwlock_local_read_lock() spins, it is spinning on fallback_rwlock,
which means lg_rwlock_global_write_lock() has taken the lgrwlock
successfully and returned, which in turn means
lg_rwlock_local_read_lock() will stop spinning once the write side
finishes.
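
To make the two points above concrete, here is a minimal user-space
sketch in C11 of the same locking paths. It is only a model, not the
kernel code: the per-CPU arch_spinlock_t is replaced by an atomic_bool,
rwlock_t by a small read-preference spinning lock built on an
atomic_int, and the CPU is passed explicitly instead of using
this_cpu_ptr() under preempt_disable(). All model_* and fallback_*
names and MODEL_NR_CPUS are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical: fixed CPU count for the model */

struct lgrwlock_model {
	atomic_bool local_lock[MODEL_NR_CPUS];	/* models the per-CPU spinlock */
	unsigned long fallback_reader_refcnt[MODEL_NR_CPUS];
	atomic_int fallback_rwlock;	/* >0: readers, -1: writer, 0: free */
};

static void model_init(struct lgrwlock_model *l)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		atomic_init(&l->local_lock[cpu], false);
		l->fallback_reader_refcnt[cpu] = 0;
	}
	atomic_init(&l->fallback_rwlock, 0);
}

/* Read-preference: a reader waits only while a writer HOLDS the lock. */
static void fallback_read_lock(atomic_int *rw)
{
	int v = atomic_load_explicit(rw, memory_order_relaxed);

	for (;;) {
		if (v < 0) {	/* writer inside: spin until it unlocks */
			v = atomic_load_explicit(rw, memory_order_relaxed);
			continue;
		}
		if (atomic_compare_exchange_weak_explicit(rw, &v, v + 1,
				memory_order_acquire, memory_order_relaxed))
			return;
	}
}

static void fallback_read_unlock(atomic_int *rw)
{
	int prev = atomic_fetch_sub_explicit(rw, 1, memory_order_release);

	assert(prev > 0);	/* must have been read-locked */
}

/* Returns true if the fast path (per-CPU trylock) was taken. */
static bool model_read_lock(struct lgrwlock_model *l, int cpu)
{
	if (l->fallback_reader_refcnt[cpu] == 0) {
		/* trylock only: the reader never spins on the per-CPU lock */
		if (!atomic_exchange_explicit(&l->local_lock[cpu], true,
					      memory_order_acquire))
			return true;
		fallback_read_lock(&l->fallback_rwlock);
	}
	l->fallback_reader_refcnt[cpu]++;
	return false;
}

static void model_read_unlock(struct lgrwlock_model *l, int cpu)
{
	if (l->fallback_reader_refcnt[cpu] == 0) {
		atomic_store_explicit(&l->local_lock[cpu], false,
				      memory_order_release);
		return;
	}
	if (--l->fallback_reader_refcnt[cpu] == 0)
		fallback_read_unlock(&l->fallback_rwlock);
}

static void model_write_lock(struct lgrwlock_model *l)
{
	int expected = 0;

	/* Same order as the draft: all per-CPU locks, then the fallback. */
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		while (atomic_exchange_explicit(&l->local_lock[cpu], true,
						memory_order_acquire))
			;	/* spin until this CPU's reader is done */
	while (!atomic_compare_exchange_weak_explicit(&l->fallback_rwlock,
			&expected, -1,
			memory_order_acquire, memory_order_relaxed))
		expected = 0;	/* wait for fallback readers to drain */
}

static void model_write_unlock(struct lgrwlock_model *l)
{
	atomic_store_explicit(&l->fallback_rwlock, 0, memory_order_release);
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		atomic_store_explicit(&l->local_lock[cpu], false,
				      memory_order_release);
}
```

In this model, the first read lock on a CPU takes the fast path. A
nested read lock on the same CPU fails the trylock and goes through the
fallback lock and the refcount instead of spinning on the per-CPU lock.
A reader can spin only in fallback_read_lock(), and only while a writer
actually holds the fallback lock.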


>
>> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw)
>> +{
>> +	lg_global_lock(&lgrw->lglock);
>
> This does a for-loop on all CPUs and takes their locks one-by-one. That's
> exactly what we want to prevent, because that is the _source_ of all our
> deadlock woes in this case. In the presence of perfect lock ordering
> guarantees, this wouldn't have been a problem (that's why lglocks are
> being used successfully elsewhere in the kernel). In the stop-machine()
> removal case, the over-flexibility of preempt_disable() forces us to provide
> an equally flexible locking alternative. Hence we can't use such per-cpu
> locking schemes.
>
> You might note that, for exactly this reason, I haven't actually used any
> per-cpu _locks_ in this synchronization scheme, though it is named
> "per-cpu rwlocks". The only per-cpu components here are the refcounts, and
> we consciously avoid waiting/spinning on them (because then that would be
> equivalent to having per-cpu locks, which are deadlock-prone). We use
> global rwlocks to get the deadlock-safety that we need.
>
>> +	write_lock(&lgrw->fallback_rwlock);
>> +}
>> +EXPORT_SYMBOL(lg_rwlock_global_write_lock);
>> +
>> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw)
>> +{
>> +	write_unlock(&lgrw->fallback_rwlock);
>> +	lg_global_unlock(&lgrw->lglock);
>> +}
>> +EXPORT_SYMBOL(lg_rwlock_global_write_unlock);
>>
>
> Regards,
> Srivatsa S. Bhat
>