Re: linux-next: manual merge of the tip tree

From: NeilBrown
Date: Mon Oct 21 2013 - 22:10:26 EST


On Thu, 17 Oct 2013 11:23:49 +0200 Peter Zijlstra <peterz@xxxxxxxxxxxxx>
wrote:

> On Thu, Oct 17, 2013 at 12:28:59PM +1100, NeilBrown wrote:
> > I always run with lockdep enabled, and I have done at least basic testing
>
> Very good!
>
> > >
> > > Stuff like:
> > >
> > > +	for (i = 0; i < NR_STRIPE_HASH_LOCKS; i++)
> > > +		spin_lock_init(conf->hash_locks + i);
> > >
> > > And:
> > >
> > > +static void __lock_all_hash_locks(struct r5conf *conf)
> > > +{
> > > +	int i;
> > > +
> > > +	for (i = 0; i < NR_STRIPE_HASH_LOCKS; i++)
> > > +		spin_lock(conf->hash_locks + i);
> > > +}
> > >
> > > Tends to complain real loud.
> >
> > Why is that?
> > Because "conf->hash_locks + i" gets used as the "name" of the lockdep map for
> > each one, and when they are all locked it looks like nested locking??
>
> Exactly so; they all share the same class (and name) because they have
> the same init site, so indeed the multiple acquisition will look like a
> nested lock.
>
> > Do you have a suggestion for how to make this work?
> > Would
> > spin_lock_nested(conf->hash_locks + i, i)
> > do the trick?
>
> spin_lock_nest_lock(conf->hash_locks + i, &conf->device_lock);

Unfortunately that doesn't work, as the order is backwards:
a hash_lock is taken first, then (when necessary) device_lock.
(hash_lock is needed more often, so we split it up to reduce contention;
device_lock is needed less often, but sometimes while a hash_lock is held.)

I've currently got:
	spin_lock_init(conf->hash_locks);
	for (i = 1; i < NR_STRIPE_HASH_LOCKS; i++)
		spin_lock_init(conf->hash_locks + i);

and

	spin_lock(conf->hash_locks);
	for (i = 1; i < NR_STRIPE_HASH_LOCKS; i++)
		spin_lock_nest_lock(conf->hash_locks + i, conf->hash_locks);
	spin_lock(&conf->device_lock);

which doesn't trigger any lockdep warnings and isn't too ugly.

Does it seem OK to you?

Thanks,
NeilBrown


>
> Would be the better option; your suggestion might just work because
> NR_STRIPE_HASH_LOCKS is 8 and we have exactly 8 subclasses available, but
> any increase to NR_STRIPE_HASH_LOCKS will make things explode again.
>
> The spin_lock_nest_lock() annotation tells that the lock order is
> irrelevant because all such multiple acquisitions are serialized under
> the other lock.
>
> Also, if in future you feel the need to increase NR_STRIPE_HASH_LOCKS,
> please keep it <= 64 or so; if you have a need to go above that, please
> yell and we'll see if we can do something smarter.

I've added a comment to this effect in the code.
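Something along these lines (hypothetical wording; the actual comment
in the patch may differ):

```c
/*
 * Keep NR_STRIPE_HASH_LOCKS <= 64: __lock_all_hash_locks() acquires
 * every hash lock at once, each spin_lock() bumps the (8-bit) preempt
 * count, and the worst-case wait for even one hash lock is bounded by
 * the time to acquire all of them.
 */
#define NR_STRIPE_HASH_LOCKS 8
```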


>
> This is because of:
> - each spin_lock() increases preempt_count and that's 8 bits; we
> wouldn't want to overflow that
> - each consecutive nested spin_lock() increases the total acquisition
> wait-time for all locks. Note that the worst case acquisition time
> for even a single hash lock is gated by the complete acquisition time
> of all of them in this scenario.

Attachment: signature.asc
Description: PGP signature