Re: [PATCH 09/19] VFS: add _async versions of the various directory modifying inode_operations

From: Al Viro
Date: Sat Feb 08 2025 - 23:57:20 EST


On Sun, Feb 09, 2025 at 01:09:10AM +0000, Al Viro wrote:
> On Fri, Feb 07, 2025 at 10:41:34PM +0000, Al Viro wrote:
>
> > I'm sorry, but I don't buy the "complete with no lock on directory"
> > part - not without a verifiable proof of correctness of the locking
> > scheme. Especially if you are putting rename into the mix.
> >
> > And your method prototypes pretty much bake that in.
> >
> > *IF* we intend to try going that way (and I'm not at all convinced
> > that it's feasible - locking aside, there's also a shitload of fun
> > with fsnotify, audit, etc.), let's make those new methods take
> > a single argument - something like struct mkdir_args, etc., with
> > inlines for extracting individual arguments out of that. Yes, it's
> > ugly, but it allows later changes without a massive headache on
> > each calling convention modification.
> >
> > Said that, an explicit description of locking scheme and a proof of
> > correctness (at least on the "it can't deadlock" level) is, IMO,
> > a hard requirement for the entire thing, async or no async.
> >
> > We *do* have such for the current locking scheme.
>
> While we are at it, the locking order is... interesting. You
> have
> * parent's ->i_rwsem before child's d_update_lock()
> * for a child, d_update_lock() before ->i_rwsem
> and that - on top of ordering between ->i_rwsem of various
> inodes.
>
> Do you actually have a proof that it's deadlock-free?

Note that "child's d_update_lock()" might very well be sleeping
on something that is no longer the parent's child, so the
ordering by depth, with ->i_rwsem and d_update_lock() interspersed,
does not hold.
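
To make the concern concrete, here is a minimal userspace sketch of the
kind of check a deadlock-freedom argument has to pass: treat every
"acquire B while holding A" rule as an edge A -> B and look for a cycle.
The three lock names and the extra "stale" edge are assumptions made for
illustration only. Whether that edge can actually arise in the patch set
is exactly the open question, so this is not a claim that the deadlock
exists.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8

/* Hypothetical lock identities, named after the rules quoted above. */
enum { P_RWSEM, C_UPDATE, C_RWSEM, NR_LOCKS };

/* edge[a][b] is true if some task may acquire lock b while holding lock a. */
struct lock_graph {
	int nr;
	bool edge[MAX_LOCKS][MAX_LOCKS];
};

static void lock_graph_init(struct lock_graph *g, int nr)
{
	memset(g, 0, sizeof(*g));
	g->nr = nr;
}

static void add_order(struct lock_graph *g, int held, int wanted)
{
	g->edge[held][wanted] = true;
}

/* Depth-first search; state: 0 = unvisited, 1 = on stack, 2 = finished. */
static bool dfs(const struct lock_graph *g, int v, int *state)
{
	state[v] = 1;
	for (int w = 0; w < g->nr; w++) {
		if (!g->edge[v][w])
			continue;
		if (state[w] == 1)
			return true;	/* back edge: ordering is cyclic */
		if (state[w] == 0 && dfs(g, w, state))
			return true;
	}
	state[v] = 2;
	return false;
}

static bool has_cycle(const struct lock_graph *g)
{
	int state[MAX_LOCKS] = { 0 };

	for (int v = 0; v < g->nr; v++)
		if (state[v] == 0 && dfs(g, v, state))
			return true;
	return false;
}

/*
 * Build the two quoted rules.  If 'stale' is set, also add the edge that
 * would appear if the dentry being waited on had been moved above the
 * directory whose ->i_rwsem the waiter holds (assumed scenario), so that
 * ancestor-first ->i_rwsem ordering runs from it back to that directory.
 */
static bool example_has_cycle(bool stale)
{
	struct lock_graph g;

	lock_graph_init(&g, NR_LOCKS);
	add_order(&g, P_RWSEM, C_UPDATE);	/* parent's ->i_rwsem, then child's d_update_lock() */
	add_order(&g, C_UPDATE, C_RWSEM);	/* child's d_update_lock(), then its ->i_rwsem */
	if (stale)
		add_order(&g, C_RWSEM, P_RWSEM);
	return has_cycle(&g);
}
```

With only the two quoted rules the graph is acyclic; one edge that runs
against the assumed depth ordering is enough to close a cycle. A proof
would have to show that no such edge can be created, rename included.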

What am I missing here? I'd been trying to come up with
a proof of deadlock avoidance, but... no luck so far.
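
On the calling-convention point in the quoted text (a single argument
such as struct mkdir_args, with inlines for extracting the individual
arguments), a rough userspace sketch might look like the following. The
field set, the accessor names and example_mkdir_async() are all
hypothetical; the stand-in types exist only so the fragment compiles
outside the kernel, and a real version would presumably carry whatever
the existing ->mkdir() receives.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types (assumption, illustration only). */
struct inode { unsigned long i_ino; };
struct dentry { const char *name; };
typedef unsigned short umode_t;

/* Hypothetical argument bundle; fields can be added without touching callers. */
struct mkdir_args {
	struct inode *dir;
	struct dentry *dentry;
	umode_t mode;
};

/* Accessors, so that methods never open-code the layout. */
static inline struct inode *mkdir_dir(const struct mkdir_args *args)
{
	return args->dir;
}

static inline struct dentry *mkdir_dentry(const struct mkdir_args *args)
{
	return args->dentry;
}

static inline umode_t mkdir_mode(const struct mkdir_args *args)
{
	return args->mode;
}

/* Hypothetical method: the prototype stays the same when the bundle grows. */
static int example_mkdir_async(const struct mkdir_args *args)
{
	if (!mkdir_dir(args) || !mkdir_dentry(args))
		return -EINVAL;
	return 0;
}
```

The point of the indirection is that a later change to what gets passed
touches the struct and its accessors, not every filesystem's method
prototype.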