Re: [RFC, PATCH] locks: remove posix deadlock detection

From: J. Bruce Fields
Date: Sun Oct 28 2007 - 21:13:44 EST


On Sun, Oct 28, 2007 at 06:40:52PM +0000, Alan Cox wrote:
> On Sun, 28 Oct 2007 12:27:32 -0600
> Matthew Wilcox <matthew@xxxxxx> wrote:
>
> > On Sun, Oct 28, 2007 at 01:43:21PM -0400, J. Bruce Fields wrote:
> > > We currently attempt to return -EDEALK to blocking fcntl() file locking
> > > requests that would create a cycle in the graph of tasks waiting on
> > > locks.
> > >
> > > This is inefficient: in the general case it requires us determining
> > > whether we're adding a cycle to an arbitrary directed acyclic graph.
> > > And this calculation has to be performed while holding a lock (currently
> > > the BKL) that prevents that graph from changing.
> > >
> > > It has historically been a source of bugs; most recently it was noticed
> > > that it could loop indefinitely while holding the BKL.
> >
> > It can also return -EDEADLK spuriously. So yeah, just kill it.
>
> NAK. This is an ABI change. It was also comprehensively rejected before
> because
>
> - EDEADLK behaviour is ABI
> - EDEADLK behaviour is required by SuSv3
> - We have no idea what applications may rely on this behaviour.

On the second point I think you're mistaken; see

http://www.opengroup.org/onlinepubs/009695399/functions/fcntl.html

"Since implementation of full deadlock detection is not
always feasible, the [EDEADLK] error was made optional."

The third objection is the one that concerns me most, and is the one I'd
like to understand better. So I'd be interested in any evidence of
applications that do actually depend on the current behavior. (Even
hypothetical use cases might be interesting.) I'm also curious what
other OS's do.

> See the thread
> http://osdir.com/ml/file-systems/2004-06/msg00017.html
>
> so we need to fix the bugs - the lock usage and the looping. At that
> point it merely becomes a performance concern to those who use it, which
> is the proper behaviour. If you want a faster non-checking one use
> flock(), or add another flag that is a Linux "don't check for deadlock"

That's an interesting idea. The flag might not help as much, since
looking for a cycle in the graph of blocked requests may be more
difficult in the case where we don't know whether the graph is acyclic
to start out with. (But I don't know--I haven't thought about it much.)

--b.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/