Re: [PATCH v2 00/41] filelock: split struct file_lock into file_lock and file_lease structs
From: Jeff Layton
Date: Thu Jan 25 2024 - 19:03:13 EST
On Fri, 2024-01-26 at 09:34 +1100, NeilBrown wrote:
> On Fri, 26 Jan 2024, Chuck Lever wrote:
> > On Thu, Jan 25, 2024 at 05:42:41AM -0500, Jeff Layton wrote:
> > > Long ago, file locks used to hang off of a singly-linked list in struct
> > > inode. Because of this, when leases were added, they were added to the
> > > same list and so they had to be tracked using the same sort of
> > > structure.
> > >
> > > Several years ago, we added struct file_lock_context, which allowed us
> > > to use separate lists to track different types of file locks. Given
> > > that, leases no longer need to be tracked using struct file_lock.
> > >
> > > That said, a lot of the underlying infrastructure _is_ the same between
> > > file leases and locks, so we can't completely separate everything.
> > >
> > > This patchset first splits a group of fields used by both file locks and
> > > leases into a new struct file_lock_core, that is then embedded in struct
> > > file_lock. Coccinelle was then used to convert a lot of the callers to
> > > deal with the move, with the remaining 25% or so converted by hand.
> > >
> > > It then converts several internal functions in fs/locks.c to work
> > > with struct file_lock_core. Lastly, struct file_lock is split into
> > > struct file_lock and file_lease, and the lease-related APIs converted to
> > > take struct file_lease.
> > >
> > > After the first few patches (which I left split up for easier review),
> > > the set should be bisectable. I'll plan to squash the first few
> > > together to make sure the resulting set is bisectable before merge.
> > >
> > > Finally, I left the coccinelle scripts I used in tree. I had heard it
> > > was preferable to merge those along with the patches that they
> > > generate, but I wasn't sure where they go. I can either move those to a
> > > more appropriate location or we can just drop that commit if it's not
> > > needed.
> > >
> > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> >
> > v2 looks nicer.
> >
> > I would add a few list handling primitives, as I see enough
> > instances of list_for_each_entry, list_for_each_entry_safe,
> > list_first_entry, and list_first_entry_or_null on fl_core.flc_list
> > to make it worth having those.
> >
> > Also, there doesn't seem to be benefit for API consumers to have to
> > understand the internal structure of struct file_lock/lease to reach
> > into fl_core. Having accessor functions for common fields like
> > fl_type and fl_flags could be cleaner.
>
> I'm not a big fan of accessor functions. They don't *look* like normal
> field access, so a casual reader has to go find out what the function
> does, just to find that it doesn't really do anything.
I might have been a bit too hasty with the idea. I took a look earlier
today and it gets pretty ugly trying to handle these fields with
accessors. flc_flags, for instance, will need both a get and a set
method, which gets wordy after a while.
Some of the flc_list accesses don't involve list walks either so I don't
think we'll ever be able to make this "neat" without a ton of one-off
accessors.
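
To illustrate the point, here is a rough userspace sketch (not taken from
the series). for_each_file_lock() and lock_is_queued() are hypothetical
names, and the list code is a cut-down stand-in for the kernel's
<linux/list.h>. A walk wrapper covers the iteration case, but a non-walk
access such as an emptiness check still needs its own one-off helper:

```c
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static inline void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static inline void list_append(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Simplified structures: only the field this sketch needs. */
struct file_lock_core {
	struct list_head flc_list;
};

struct file_lock {
	struct file_lock_core fl_core;
};

/* Map an embedded flc_list node back to its struct file_lock. */
#define flc_list_to_lock(ptr) \
	((struct file_lock *)((char *)(ptr) - \
		offsetof(struct file_lock, fl_core.flc_list)))

/* Hypothetical walk wrapper hiding fl_core.flc_list from callers. */
#define for_each_file_lock(fl, head) \
	for ((fl) = flc_list_to_lock((head)->next); \
	     &(fl)->fl_core.flc_list != (head); \
	     (fl) = flc_list_to_lock((fl)->fl_core.flc_list.next))

/* The walk case is covered by the wrapper... */
static inline int count_locks(struct list_head *head)
{
	struct file_lock *fl;
	int n = 0;

	for_each_file_lock(fl, head)
		n++;
	return n;
}

/* ...but a non-walk access needs yet another one-off helper. */
static inline bool lock_is_queued(const struct file_lock *fl)
{
	return fl->fl_core.flc_list.next != &fl->fl_core.flc_list;
}
```

Each additional access pattern (first entry, first-or-NULL, safe
iteration) would need a similar wrapper of its own.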
> But neither am I a fan of requiring filesystems to use
> "fl_core.flc_foo". As you say, reaching into fl_core isn't ideal.
>
I too think it's ugly.
> It would be nice if we could make fl_core an anonymous structure, but
> that really requires -fplan9-extensions which Linus is on-record as not
> liking.
> Unless...
>
> How horrible would it be to use
>
> 	union {
> 		struct file_lock_core flc_core;
> 		struct file_lock_core;
> 	};
>
> I think that only requires -fms-extensions, which Linus was less
> negative towards. That would allow access to the members of
> file_lock_core without the "flc_core." prefix, but would still allow
> getting the address of 'flc_core'.
> Maybe it's too ugly.
>
I'd rather not rely on special compiler flags.
> While fl_type and fl_flags are most common, fl_pid, fl_owner, fl_file
> and even fl_wait are also used. Having accessor functions for all of those
> would be too much I think.
>
Some of them need setters too, and with some, like fl_flags, callers
want to be able to do this:

	fl->fl_flags |= FL_SLEEP;

That's hard to deal with in an accessor unless you want to do it with
macros or something.
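
For reference, this is roughly what the accessor route looks like. It is
only a sketch: the lock_get_flags(), lock_set_flags() and
lock_set_flag_bits() names are made up for illustration, the structs are
trimmed to a single field, and the FL_SLEEP value is a placeholder.

```c
/* Trimmed-down structures: only the field under discussion. */
struct file_lock_core {
	unsigned int flc_flags;
};

struct file_lock {
	struct file_lock_core fl_core;
};

#define FL_SLEEP 128	/* placeholder value for this sketch */

/* Hypothetical get/set pair. */
static inline unsigned int lock_get_flags(const struct file_lock *fl)
{
	return fl->fl_core.flc_flags;
}

static inline void lock_set_flags(struct file_lock *fl, unsigned int flags)
{
	fl->fl_core.flc_flags = flags;
}

/* The "|=" idiom spelled out as a read-modify-write through the pair. */
static inline void request_sleep_verbose(struct file_lock *fl)
{
	lock_set_flags(fl, lock_get_flags(fl) | FL_SLEEP);
}

/* Hypothetical third helper that exists only to recover "|=". */
static inline void lock_set_flag_bits(struct file_lock *fl, unsigned int bits)
{
	fl->fl_core.flc_flags |= bits;
}
```

Either the call sites get longer, or the helper count grows by one per
idiom (set bits, clear bits, test bits) for every field treated this way.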
> Maybe higher-level functions which meet the real need of the filesystem
> might be a useful approach:
>
> locks_wakeup(lock)
> locks_wait_interruptible(lock, condition)
> locks_posix_init(lock, type, pid, ...) ??
> locks_is_unlock() - fl_type is compared with F_UNLCK 22 times.
>
> While those are probably a good idea, they don't really help much
> with reducing the need for accessor functions.
>
I can take a look at some of those. Reducing the number of instances can
only help.
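
As a rough idea of the shape such helpers could take, here is a userspace
sketch. The locks_is_unlock() name comes from the suggestion above, but
its signature is an assumption, locks_is_write() is a hypothetical
companion, and the structs are trimmed to the one field involved.

```c
#include <fcntl.h>	/* F_RDLCK, F_WRLCK, F_UNLCK */
#include <stdbool.h>

/* Trimmed-down structures: only the lock type is modelled. */
struct file_lock_core {
	int flc_type;
};

struct file_lock {
	struct file_lock_core fl_core;
};

/* Assumed signature: true if this request releases a lock. */
static inline bool locks_is_unlock(const struct file_lock *fl)
{
	return fl->fl_core.flc_type == F_UNLCK;
}

/* Hypothetical companion: true for an exclusive (write) lock. */
static inline bool locks_is_write(const struct file_lock *fl)
{
	return fl->fl_core.flc_type == F_WRLCK;
}
```

A caller would then write "if (locks_is_unlock(fl))" and never spell out
the path to the type field.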
> I don't suppose we could just leave the #defines in place? Probably not
> a good idea.
>
> Maybe spell "fl_core" as "c"? lk->c.flc_flags ???
>
It's at least a little shorter. I can make that change if it's
preferred.
>
> And I wonder if we could have a new fl_flag for 'FOREIGN' locks rather
> than encoding that flag in the sign of the pid. That seems a bit ...
> clunky?
>
The kernel just treats the fl_pid as an opaque value that gets reported
to various consumers. Having it encoded in the sign is actually more
convenient, since reporting "foreign" lock holders as negative pid
values has some precedent in Unix.
flock and posix locks conflict on BSD, and the POSIX lock API reports
fl_pid as '-1' when there is a conflicting flock lock. I think Solaris
may also report remote NFS locks as negative numbers (not certain
there).
So it works in our favor in this case, but it is a hack.
Now that I look too, I'm not sure why fl_pid is unsigned given that
pid_t is signed. I'll have to look into that as well.
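
A small userspace sketch of why the signedness matters, under the
assumption that the field is a plain unsigned int. The helper names are
hypothetical, and the cast back to pid_t relies on the usual
two's-complement conversion behaviour of gcc and clang.

```c
#include <stdbool.h>
#include <sys/types.h>	/* pid_t */

/* Trimmed-down structure: only the pid field is modelled. */
struct file_lock_core {
	unsigned int fl_pid;	/* unsigned, as discussed above */
};

/*
 * Hypothetical helper: recover the signed value that gets reported.
 * Testing "flc->fl_pid < 0" directly would always be false because
 * the field is unsigned, so the value is converted first.
 */
static inline pid_t lock_reported_pid(const struct file_lock_core *flc)
{
	return (pid_t)flc->fl_pid;
}

/* Hypothetical helper: a negative reported pid marks a foreign holder. */
static inline bool lock_is_foreign(const struct file_lock_core *flc)
{
	return lock_reported_pid(flc) < 0;
}
```

With a signed field the conversion step would go away and the sign test
could be done on the field itself.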
--
Jeff Layton <jlayton@xxxxxxxxxx>