Re: deadlock in synchronize_srcu() in debugfs?

From: Johannes Berg
Date: Thu Mar 30 2017 - 07:11:30 EST


On Thu, 2017-03-30 at 12:27 +0200, Nicolai Stange wrote:
> So, please correct me if I'm wrong, there are two problems with
> indefinitely blocking debugfs files' fops:
>
> 1. The one which actually hung your system:
> ÂÂÂAn indefinitely blocking debugfs_remove() while holding a lock.
> ÂÂÂOther tasks attempting to grab that same lock get stuck as well.
>
> 2. The other one you've found, namely that the locking granularity is
> ÂÂÂtoo coarse: a debugfs_remove() would get blocked by unrelated
> files'
> ÂÂÂpending fops.

No, this isn't really an accurate description of the two problems.

> AFAICS, the first one can't get resolved by simply refining the
> blocking granularity: a debugfs_remove() on the indefinitely blocking
> file would still block as well.

Correct.

The first problem - the one I ran into - is the following:

1)
A given debugfs file's .read() was waiting for some event to happen
(being a blocking file), and I was trying to debugfs_remove() some
completely unrelated file, this got stuck.
Due to me holding a lock while doing this debugfs_remove(), other tasks
*also* got stuck, but that's just a sub-problem - having the
debugfs_remove() of an unrelated file get stuck would already have been
a problem - the fact that other tasks also got stuck was just an
additional wrinkle.

Mind - this is a livelock of sorts - if the debugfs file will ever make
progress, the system can recover.

2)
There's a complete deadlock situation if this happens:

CPU1 CPU2

debugfs_file_read(file="foo") mutex_lock(&M);
srcu_read_lock(&debugfs_srcu); debugfs_remove(file="bar")
mutex_lock(&M); synchronize_srcu(&debugfs_srcu)

This is intrinsically unrecoverable.

> > Yes, I'm pretty much convinced that it is. I considered doing a
> > deferred debugfs_remove() by holding the object around, but then I
> > can't be sure that I can later re-add a new object with the same
> > directory name,
>
> Ah, I see. What about making debugfs provide separate
>
> - debugfs_remove_start(): unlink file (c.f. __debugfs_remove()), can
> be
> Â called under a lock
> and
> - debugfs_remove_wait(): the synchronize_srcu(), must not get called
> Â under any lock
>
> ?

I don't think it would really help much - the lock acquisition in my
case is in a completely different layer (cfg80211) than the code doing
debugfs_remove(), so delaying the debugfs_remove_wait() would mean
moving it somewhere else completely. Also, afaict you still have to
keep the object around until debugfs_remove_wait() has finished, so you
still have the name reuse problem.

> In short, I'd make calling debugfs_remove() with any lock being
> held illegal.
>
> What do you think?

I think I'll stop using debugfs if that happens - too much hassle.

> > Half the thread here was about that - it's not easily doable
> > because
> > you'd have to teach lockdep about the special SRCU semantics first.
> > Since it doesn't even seem to do read/write locks properly that's
> > probably a massive undertaking.
>
> I haven't looked into lockdep yet. So there is no way to ask lockdep
> "is there any lock held?" from debugfs_remove() before doing the
> synchonize_srcu()?Â

That's probably possible, yes.

> > > > Similarly, nobody should be blocking in debugfs files, like we
> > > > did
> > > > in ours, but also smsdvb_stats_read(), crtc_crc_open() look
> > > > like
> > > > they could block for quite a while.
> > >
> > > Blocking in the debugfs files' fops shall be fine by itself,
> > > that's why SRCU is used for the removal stuff.
> >
> > No, it isn't fine at all now!
>
> "Shall" in the sense of "it's a requirement" and if it isn't
> fulfilled, it must be fixed. So I do agree with you here.

Ok. So let's assume that we allow blocking (indefinitely, at least
until you remove the file) for a debugfs file - then immediately due to
the way SRCU is used you've now made some debugfs_remove() calls block
indefinitely. Not just on the same file - that'd be fine and a bug,
because before you remove a file you should wake it up - but any other
file in the system.

Even splitting it into debugfs_remove_start() and debugfs_remove_wait()
will not do anything to fix this problem - debugfs_remove_wait() would
then block forever and the task that called it will not be able to make
any forward progress, until the completely unrelated other files
created some data or got closed.

This is the core of the problem really - that you're tying completely
unrelated processes together.

Therefore, to continue using SRCU in this way means that you have to
disallow blocking debugfs files. There may not be many of those, but
any single one of them would be a problem.

If we stop using SRCU this way we can discuss how we can fix it - but
anything more coarse grained than per-file (which really makes SRCU
unsuitable) would still have the same problem one way or another. And
we haven't even addressed the deadlock situation (2 above) either.

> When I did this, per-file reader/writer locks actuallt came to my
> mind first. The problem here is that debugfs_use_file_start() must
> grab the lock first and check whether the file has been deleted in
> the meanwhile. But as it stands, there's nothing that would guarantee
> the existence of the lock at the time it's to be taken.

That seems like a strange argument to me - something has to exist for a
process to be able to look up the file, and currently the proxy also
has to exist?

So when a file is created you can allocate the proxy for it, and if you
can look up the proxy object - perhaps even using plain RCU - then you
also have the lock? IOW, instead of storing just the real_fops in
d_fsdata, you can store a small object that holds a lock and the
real_fops. You can always access that object, and lock it, but the
real_fops inside it might eventually end up NULL which you handle
through proxying. No?

johannes