Re: [PATCH RFC ticketlock] Auto-queued ticketlock
From: Al Viro
Date: Wed Jun 12 2013 - 20:50:00 EST
On Wed, Jun 12, 2013 at 05:38:13PM -0700, Linus Torvalds wrote:
> On Wed, Jun 12, 2013 at 5:20 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > Actually, dget_parent() change might be broken; the thing is, the assumptions
> > are more subtle than "zero -> non-zero only happens under ->d_lock". It's
> > actually "new references are grabbed by somebody who's either already holding
> > one on the same dentry _or_ holding ->d_lock". That's what d_invalidate()
> > check for ->d_count needs for correctness - caller holds one reference, so
> > comparing ->d_count with 2 under ->d_lock means checking that there's no other
> > holders _and_ there won't be any new ones appearing.
>
> For the particular case of dget_parent() maybe dget_parent() should
> just double-check the original dentry->d_parent pointer after getting
> the refcount on it (and if the parent has changed, drop the refcount
> again and go to the locked version). That might be a good idea anyway,
> and should fix the possible race (which would be with another cpu
> having to first rename the child to some other parent, and then
> d_invalidate() the original parent).
Yes, but... Then we'd need to dput() that sucker if we decide we shouldn't
have grabbed that reference, after all, which would make dget_parent()
potentially blocking.
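
To make the problem concrete, here is a rough userspace model of the
"grab the refcount, then recheck ->d_parent" idea. It is only a sketch:
struct toy_dentry, toy_pin_parent() and toy_confirm_parent() are made-up
names, C11 atomics stand in for whatever the kernel would really use, and
the parent pointer is assumed never to be NULL. The point is the last
branch: if the rename has already dropped the child's reference to the old
parent, the undo is the final put, and the real dput() could block there.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry; NOT the kernel's struct dentry. */
struct toy_dentry {
    atomic_int count;                     /* models ->d_count  */
    _Atomic(struct toy_dentry *) parent;  /* models ->d_parent */
};

enum toy_verdict {
    TOY_KEEP,               /* parent unchanged, reference is good        */
    TOY_UNDO_CHEAP,         /* parent changed, others still hold old one  */
    TOY_UNDO_WAS_FINAL_PUT  /* parent changed, we dropped the last ref    */
};

/* Take a reference without any lock, but only if count is non-zero. */
static bool toy_get_unless_zero(struct toy_dentry *d)
{
    int old = atomic_load(&d->count);

    while (old > 0) {
        /* On failure 'old' is reloaded and the loop re-tests it. */
        if (atomic_compare_exchange_weak(&d->count, &old, old + 1))
            return true;
    }
    return false;
}

/* Step 1: speculatively pin whatever parent we see right now. */
static struct toy_dentry *toy_pin_parent(struct toy_dentry *child)
{
    struct toy_dentry *p = atomic_load(&child->parent);

    return toy_get_unless_zero(p) ? p : NULL;
}

/* Step 2: recheck the parent pointer and undo the pin on mismatch. */
static enum toy_verdict toy_confirm_parent(struct toy_dentry *child,
                                           struct toy_dentry *pinned)
{
    int before;

    if (atomic_load(&child->parent) == pinned)
        return TOY_KEEP;

    before = atomic_fetch_sub(&pinned->count, 1);
    assert(before > 0);
    /* before == 1 means nobody else is left: a real dput() would now
     * have to tear the dentry down, which is where it may block. */
    return before == 1 ? TOY_UNDO_WAS_FINAL_PUT : TOY_UNDO_CHEAP;
}
```

A caller would fall back to the locked variant on NULL or on either
TOY_UNDO_* result; the model says nothing about whether that fallback is
enough for the d_invalidate() check.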
> That said, the case we'd really want to fix isn't dget_parent(), but
> just the normal RCU lookup finishing touches (the __d_rcu_to_refcount()
> case you already mentioned). *If* we could do that without ever
> taking the d_lock on the target, that would be lovely. But it would
> seem to have the exact same issue. Although maybe the
> dentry_rcuwalk_barrier() thing ends up solving it (ie if we had a
> lookup at a bad time, we know it will fail the sequence count test, so
> we're ok).
Maybe, but that would require dentry_rcuwalk_barrier() between any such
check and corresponding grabbing of ->d_lock done for it, so it's not
just d_invalidate().
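
For reference, the bare mechanism being relied on looks roughly like the
model below: a lockless reader samples a sequence count, and a barrier
issued later makes that sample fail validation. The names (toy_seq,
toy_seq_sample() and so on) are invented for the illustration, and the
lock_held flag is just a stand-in for "caller holds ->d_lock". This shows
only the sample/validate pattern; it does not show that putting the
barrier in any particular set of places is sufficient.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a per-dentry sequence count; hypothetical type. */
struct toy_seq {
    atomic_uint seq;
};

/* Reader side: remember the sequence number seen at lookup time. */
static unsigned toy_seq_sample(struct toy_seq *s)
{
    return atomic_load(&s->seq);
}

/* Checker side: bump the count so every earlier sample goes stale.
 * The caller is expected to already hold the lock that protects the
 * check it is about to make. */
static void toy_seq_barrier(struct toy_seq *s, bool lock_held)
{
    assert(lock_held);
    atomic_fetch_add(&s->seq, 2);
}

/* Reader side: true only if no barrier happened since the sample. */
static bool toy_seq_still_valid(struct toy_seq *s, unsigned sampled)
{
    return atomic_load(&s->seq) == sampled;
}
```

A reader that sampled before the barrier sees a mismatch and has to fall
back; a reader that samples after it is not caught by this barrier at all,
which is why each such check would need one of its own.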
> Subtle, subtle.
Yes ;-/ The current variant is using ->d_lock as a brute-force mechanism
for avoiding all that fun, and I'm not sure that getting rid of it would
buy us enough to make it worth the trouble. I'm absolutely sure that if
we go for that, we _MUST_ document the entire scheme as explicitly as
possible, or we'll end up with the shitload of recurring bugs in that
area. Preferably with the formal proof of correctness spelled out somewhere...
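
As a starting point for that kind of documentation, the brute-force rule
can be modelled in a few lines of userspace C. Everything here is
hypothetical (lk_dentry, lk_get(), lk_try_invalidate(), a toy spinlock
built on atomic_flag), and the model is stricter than the rule stated
earlier: it takes the lock for every change of the count, including gets
by somebody who already holds a reference. What it demonstrates is only
that a "count < 2" test made under the lock stays true until the lock is
dropped, because nobody can raise the count in between.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a dentry guarded by its own lock. */
struct lk_dentry {
    atomic_flag lock;  /* models ->d_lock                     */
    int count;         /* models ->d_count, lock-protected    */
    bool unhashed;     /* set when the invalidation goes ahead */
};

static void lk_init(struct lk_dentry *d, int count)
{
    atomic_flag_clear(&d->lock);
    d->count = count;
    d->unhashed = false;
}

static void lk_lock(struct lk_dentry *d)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&d->lock,
                                             memory_order_acquire))
        ;
}

static void lk_unlock(struct lk_dentry *d)
{
    atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* Every reference change happens under the lock in this model. */
static void lk_get(struct lk_dentry *d)
{
    lk_lock(d);
    d->count++;
    lk_unlock(d);
}

static void lk_put(struct lk_dentry *d)
{
    lk_lock(d);
    assert(d->count > 0);
    d->count--;
    lk_unlock(d);
}

/* Caller holds one reference of its own. Check and action share one
 * critical section, so no new holder can appear between them. */
static bool lk_try_invalidate(struct lk_dentry *d)
{
    bool ok;

    lk_lock(d);
    ok = d->count < 2;      /* nobody but the caller holds a ref */
    if (ok)
        d->unhashed = true;
    lk_unlock(d);
    return ok;
}
```

Any lockless way of raising the count has to be shown not to break the
guarantee that lk_try_invalidate() gets for free here.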