Re: [RFC PATCH 0/5] locks: implement "filp-private" (aka UNPOSIX) locks

From: Volker Lendecke
Date: Tue Oct 15 2013 - 04:57:14 EST


On Mon, Oct 14, 2013 at 08:23:03AM -0700, Frank Filz wrote:
> > > > http://www.samba.org/samba/news/articles/low_point/tale_two_stds_os2.html
> > > >
> > > > See the section entitled "First Implementation Past the Post".
> > >
> > > Interesting that Jeremy actually suggested the implementation should
> > > have had an arbitrary lock owner as part of the flock structure:
> > >
> > > "This is an example of a POSIX interface not being future-proofed
> > > against modern techniques such as threading. A simple amendment to the
> > > original primitive allowing a user-defined "locking context" (like a
> > > process id) to be entered in the struct flock structure used to define
> > > the lock would have fixed this problem, along with extra flags
> > > allowing the number of locks per context to be recorded if needed."
> > >
> > > But I'm happy with the lock context per kernel struct file as a
> > > solution, especially since that will allow locks to be sensibly passed
> > > to a forked process.
> > >
> > > Another next step would be an asynchronous blocking lock...
> >
> > Yes, please :-)
>
> What model would be useful to you (and for what project)? One thing I could

It's ctdb that would be mainly interested in this. ctdb
deals a lot with our tdb files, a shared mmap key/value
database protected by fcntl locks. ctdb is the database
daemon distributing records in a cluster. It is a
single-threaded async event loop, and it has to fork helper
processes waiting for locks.
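
For readers unfamiliar with the pattern, here is a minimal sketch of a
forked lock helper. It is not ctdb's actual code: the function names
spawn_lock_helper() and stop_lock_helper() and the one-byte status
protocol are assumptions made for illustration. The helper blocks in
F_SETLKW, and the event loop only has to watch the returned pipe. Note
that a classic fcntl lock taken this way belongs to the helper process,
not to the parent.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a helper that blocks in F_SETLKW for a write lock on the whole
 * file behind `fd`. Returns the read end of a pipe that becomes readable
 * once the helper has a result: one status byte, 0 = lock held by the
 * helper, 1 = fcntl failed. Returns -1 if the helper cannot be started.
 * (Hypothetical helper, not an existing API.)
 */
static int spawn_lock_helper(int fd, pid_t *child)
{
    int p[2];

    assert(child != NULL);
    if (pipe(p) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    if (pid == 0) {
        /* Helper: the only process that sleeps waiting for the lock. */
        struct flock fl = {
            .l_type = F_WRLCK,
            .l_whence = SEEK_SET,
            .l_start = 0,
            .l_len = 0,          /* 0 means "to end of file" */
        };
        char status = (fcntl(fd, F_SETLKW, &fl) == 0) ? 0 : 1;

        close(p[0]);
        if (write(p[1], &status, 1) != 1)
            _exit(1);
        pause();                 /* keep holding the lock until killed */
        _exit(0);
    }

    /* Parent: keep only the read end; poll/select on it in the event loop. */
    close(p[1]);
    *child = pid;
    return p[0];
}

/* Terminate the helper, which drops its lock, and reap it. */
static void stop_lock_helper(pid_t child, int rfd)
{
    close(rfd);
    kill(child, SIGTERM);
    waitpid(child, NULL, 0);
}
```

The cost is one process per lock being waited on, which is the overhead
an asynchronous blocking lock would remove.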

> think of is since we have a file descriptor for each lock owner/file pair,
> we could do something like select on those descriptors, got to think about
> how that would actually work though... The vfs lock layer does inherently
> support a kernel call back when a blocked lock can be unblocked, so we just
> need to figure out the best way to hook that up to user space in a way that
> doesn't require a thread per blocked lock.

A model that would probably work for us is one file
descriptor that becomes readable when one of the blocking
lock states changes. To signal which one changed, I think
passing an opaque uint64 (usable as a pointer) for the
blocking lock would be great, or possibly something like
epoll_data_t. We would pass this in the fcntl call and read
it from the signal, possibly together with an errno
(EDEADLK?). Not sure if this is feasible kernel-side, but I
believe this is something that would work for us user-side.
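
To make the user-side view concrete, here is a rough sketch of how an
event loop might consume such notifications. No such kernel interface
exists; struct lock_event, its layout and the function names below are
purely hypothetical, and post_lock_event() merely stands in for whatever
the kernel would do, so that the reader side can be tried with a pipe.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical notification record; not an existing kernel ABI. */
struct lock_event {
    uint64_t cookie;   /* opaque value passed in with the lock request */
    int32_t  error;    /* 0 if the lock was granted, else an errno value */
    int32_t  pad;      /* keeps the record at a fixed 16 bytes */
};

/* Stand-in for the kernel side: queue one event on the notification fd. */
static int post_lock_event(int fd, uint64_t cookie, int error)
{
    struct lock_event ev;

    memset(&ev, 0, sizeof(ev));
    ev.cookie = cookie;
    ev.error = error;
    return write(fd, &ev, sizeof(ev)) == (ssize_t)sizeof(ev) ? 0 : -1;
}

/*
 * Event-loop side: wait up to timeout_ms for the fd to become readable,
 * then read exactly one event. Returns 0 on success, -1 on timeout or error.
 */
static int wait_lock_event(int fd, int timeout_ms, struct lock_event *ev)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    assert(ev != NULL);
    if (poll(&pfd, 1, timeout_ms) <= 0)
        return -1;
    return read(fd, ev, sizeof(*ev)) == (ssize_t)sizeof(*ev) ? 0 : -1;
}
```

The caller would cast the cookie back to its own per-request pointer,
e.g. (struct my_request *)(uintptr_t)ev.cookie, and check ev.error for
something like EDEADLK before treating the lock as granted.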

Volker

--
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
http://www.sernet.de, mailto:kontakt@xxxxxxxxx