Re: [resend PATCH v2 2/2] fuse: ensure that submounts lookup their parent
From: Krister Johansen
Date: Mon Oct 09 2023 - 22:43:32 EST
On Mon, Oct 09, 2023 at 09:45:08PM +0200, Miklos Szeredi wrote:
> On Mon, 2 Oct 2023 at 17:24, Krister Johansen <kjlx@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > The submount code uses the parent nodeid passed into the function in
> > order to create the root dentry for the new submount. This nodeid does
> > not get its remote reference count incremented by a lookup operation.
> >
> > If the parent inode is evicted from its superblock, due to memory
> > pressure for example, it can result in a forget operation being sent to
> > the server. Should this nodeid be forgotten while it is still in use in
> > a submount, users of the submount get an error from the server on any
> > subsequent access. In the author's case, this was an EBADF on all
> > subsequent operations that needed to reference the root.
> >
> > Debugging the problem revealed that the dentry shrinker triggered a forget
> > after killing the dentry with the last reference, despite the root
> > dentry in another superblock still using the nodeid.
>
> There's some context missing here. There are two dentries: a mount
> point in the parent mount and the root of the submount.
>
> The server indicates that the looked up inode is a submount using
> FUSE_ATTR_SUBMOUNT. Then AFAICS the following happens:
>
> 1) the mountpoint dentry is created with nlookup = 1. The
> S_AUTOMOUNT flag is set on the mountpoint inode.
>
> 2) the lookup code sees S_AUTOMOUNT and triggers the submount
> operation, which sets up the new super_block and the root dentry with
> the user supplied nodeid and with nlookup = 0 (because it wasn't
> actually looked up).
>
> How the automount gets torn down is another story. You say that the
> mount point gets evicted due to memory pressure. But it can't get
> evicted while the submount is attached. So the submount must first
> get detached, and then the mount point can be reclaimed. The
> question is: how does the submount get detached? Do you have an
> idea?

Apologies for not stating this clearly. The use case is a container
running in a VM, and the container's root is provided to the guest via
virtiofs. I believe the submount is getting detached as part of the
container setup, either via a umount2(MNT_DETACH) of the old root
filesystem, or as part of pivot_root() itself. By the time I'm able to
inspect the dentry associated with the submount in the initial mount ns
(case #1) its d_lockref.count is 0, and /proc/self/mountinfo doesn't show
an active mount for the submount in that mount namespace.
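
For reference, a minimal sketch of that kind of mountinfo check. It
assumes the usual mountinfo layout, where the fifth whitespace-separated
field is the mount point; the helper name is made up for illustration
and this is not the exact tooling used during debugging.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if any line of a mountinfo-formatted
 * stream lists 'path' as its mount point, 0 otherwise.
 *
 * Assumed line layout (first five fields):
 *   mount-ID parent-ID major:minor root mount-point ...
 * Spaces inside paths are escaped as \040 in mountinfo, so reading the
 * field with %s is sufficient for this sketch.
 */
static int mountinfo_has_mountpoint(FILE *fp, const char *path)
{
	char line[4096];
	char mnt[4096];

	while (fgets(line, sizeof(line), fp)) {
		/* Skip the first four fields, keep the fifth. */
		if (sscanf(line, "%*d %*d %*s %*s %4095s", mnt) == 1 &&
		    strcmp(mnt, path) == 0)
			return 1;
	}
	return 0;
}
```

Fed a stream opened on /proc/self/mountinfo, this only sees mounts in the
caller's own mount namespace, which is why a submount that was detached
in the initial namespace would no longer show up there.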

If I manually traverse the path to the submount via something like cd
and ls from the initial mount namespace, it'll stay referenced until I
run a umount for the automounted path. I'm reasonably sure it's the
container setup that's causing the detaching.
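
To spell out the accounting I believe is going wrong, here is a toy
userspace model. It is not kernel or server code, and every name in it
is made up for illustration. It assumes the mountpoint inode holds
nlookup = 1, the submount root reuses the same nodeid with nlookup = 0,
and evicting the mountpoint sends a forget for its full nlookup.

```c
#include <stdint.h>

/* Toy model only: these types do not exist in the kernel or the server. */
struct toy_server {
	uint64_t nlookup;	/* server-side count for one nodeid */
};

struct toy_inode {
	uint64_t nlookup;	/* kernel-side count for one inode */
};

/* A successful lookup takes one reference on both sides. */
static void toy_lookup(struct toy_server *srv, struct toy_inode *inode)
{
	srv->nlookup++;
	inode->nlookup++;
}

/* Eviction sends a forget carrying everything this inode looked up. */
static void toy_evict(struct toy_server *srv, struct toy_inode *inode)
{
	if (inode->nlookup) {
		srv->nlookup -= inode->nlookup;
		inode->nlookup = 0;
	}
}

/*
 * Return 1 if the server still knows the nodeid after the mountpoint
 * inode is evicted, 0 if it has been forgotten.
 */
static int toy_submount_survives(int root_does_lookup)
{
	struct toy_server srv = { 0 };
	struct toy_inode mountpoint = { 0 };
	struct toy_inode subroot = { 0 };

	toy_lookup(&srv, &mountpoint);		/* mountpoint: nlookup = 1 */
	if (root_does_lookup)
		toy_lookup(&srv, &subroot);	/* root takes its own reference */
	/* Otherwise the root just reuses the nodeid with nlookup = 0. */

	toy_evict(&srv, &mountpoint);		/* detached, then reclaimed */
	return srv.nlookup > 0;
}
```

In this model toy_submount_survives(0) is the failing sequence: the
server's count reaches zero while the submount root still uses the
nodeid. toy_submount_survives(1) models the root taking its own lookup
reference, as the subject line describes.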

I'm happy to go debug this some more, though, if you're skeptical of the
explanation.

-K