Re: [PATCH 0/5] fuse: handle release synchronously (v4)

From: Maxim Patlasov
Date: Wed Oct 01 2014 - 07:28:30 EST

On 10/01/2014 12:44 AM, Linus Torvalds wrote:
> On Tue, Sep 30, 2014 at 12:19 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> > What about flock(2), FL_SETLEASE, etc semantics (which are the sane ones,
> > compared to the POSIX locks shit which mandates release of lock on each close(2)
> > instead of "when all [duplicate] descriptors have been closed")?
>
> You have to do that from ->release(), there's no question about that.
> We do locks_remove_file() independently on ->release, but yes, it's
> basically done just before the last release.
>
> But it has the *exact* same semantics as release, including very much
> having nothing what-so-ever to do with "last close()".
>
> If the file descriptor is opened for other reasons (ie mmap, /proc
> accesses, whatever), then that delays locks_remove_file() the same way
> it delays release.
>
> None of that has *anything* to do with "synchronous". Thinking it does is wrong.
>
> And none of this has *anything* to do with the issue that Maxim
> pointed to in the mailing list web page, which was about write caches,
> and how you cannot (and MUST NOT) delay them until release time.

I apologise for mentioning that mailing list web page in my cover message. It was really misleading; I should have thought about it in advance. Of course, write caches must be flushed in the scope of ->flush(), not ->release(). Please let me set forth the use case that led me to those patches.

We implemented a FUSE-based distributed storage solution intended for keeping images of VMs (virtual machines) and their configuration files. The way VMs use images makes exclusive-open()er semantics very attractive: while a VM is using its image on one node, concurrent access to that image from other nodes is neither desirable nor necessary. So, we acquire an exclusive lease on FUSE_OPEN and release it on FUSE_RELEASE. This is quite natural and obviously has nothing to do with FUSE_FLUSH.

Following such semantics, there are two choices for handling open() if the file is currently exclusively locked by a remote node: (a) return EBUSY; (b) block until the remote node releases the file. We chose (a), because (b) is very inconvenient in practice: most applications handle a failed open(2) properly, but very few are clever enough to spawn a separate thread for the open() and kill it if the open() has not succeeded within a reasonable time.
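
To make choice (a) concrete, here is a minimal sketch in C of the lease bookkeeping described above. It is only an illustration under assumptions: struct lease, lease_open() and lease_release() are hypothetical names, not part of FUSE or of our storage code, and the model only tracks which node currently holds the lease on one image file.

```c
#include <errno.h>

/* Hypothetical model of the exclusive lease on one image file.
 * None of these names come from FUSE; they are illustrative only. */
struct lease {
    int held;    /* nonzero while some node holds the lease */
    int holder;  /* id of the holding node, meaningful only if held */
};

/* Models the handling of FUSE_OPEN under choice (a): if the lease is
 * already held, fail immediately with -EBUSY instead of blocking. */
static int lease_open(struct lease *l, int node)
{
    if (l->held)
        return -EBUSY;
    l->held = 1;
    l->holder = node;
    return 0;
}

/* Models the handling of FUSE_RELEASE: only the holder may drop the lease. */
static int lease_release(struct lease *l, int node)
{
    if (!l->held || l->holder != node)
        return -EINVAL;
    l->held = 0;
    return 0;
}
```

With this model an open from a second node fails with -EBUSY until the first node's release has actually been processed, which is exactly where the timing of FUSE_RELEASE starts to matter.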

The patches I sent do essentially one thing: they make FUSE ->release() wait for an ACK from userspace before returning. Without these patches, any attempt to test or use our storage in valid use cases led to spurious EBUSY. For example, while migrating a VM from one node to another, we first close the image file on the source node, then try to open it on the destination node, but the open fails because FUSE_RELEASE has not yet been processed by userspace on the source node.
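
The migration race can be sketched as a small single-threaded model in C. This is a simplification under stated assumptions, not the patch itself: struct node_state, daemon_process(), source_close() and dest_open() are hypothetical names, and the "wait for ACK" step is modelled by simply running the daemon's handler before close returns.

```c
#include <errno.h>

/* Hypothetical, simplified state of one image file during migration. */
struct node_state {
    int lease_held;       /* lease as seen by the storage cluster */
    int release_pending;  /* FUSE_RELEASE queued, not yet handled by userspace */
};

/* The userspace daemon on the source node handles a queued release. */
static void daemon_process(struct node_state *s)
{
    if (s->release_pending) {
        s->release_pending = 0;
        s->lease_held = 0;
    }
}

/* close() on the source node. With wait_for_ack == 0 it returns as soon
 * as the release is queued; with wait_for_ack != 0 it returns only after
 * the daemon has handled it (standing in for blocking on the ACK). */
static void source_close(struct node_state *s, int wait_for_ack)
{
    s->release_pending = 1;
    if (wait_for_ack)
        daemon_process(s);
}

/* open() on the destination node: -EBUSY while the lease is still held. */
static int dest_open(struct node_state *s)
{
    if (s->lease_held)
        return -EBUSY;
    s->lease_held = 1;
    return 0;
}
```

In this model, source_close(&s, 0) followed directly by dest_open(&s) yields -EBUSY even though the source has already returned from close, while source_close(&s, 1) lets the following dest_open(&s) succeed.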

Given that those patches must die, do you have any ideas on how to resolve that "spurious EBUSY" problem?
