Re: [PATCH] fhandle: fix UAF due to unlocked ->mnt_ns read in may_decode_fh()

From: Al Viro

Date: Wed Jun 03 2026 - 15:06:34 EST


On Wed, Jun 03, 2026 at 07:53:24PM +0100, Al Viro wrote:
> On Wed, Jun 03, 2026 at 08:46:07PM +0200, Jann Horn wrote:
> > On Wed, Jun 3, 2026 at 8:24 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > > On Wed, Jun 03, 2026 at 07:15:23PM +0100, Al Viro wrote:
> > > > On Wed, Jun 03, 2026 at 07:38:06PM +0200, Jann Horn wrote:
> > > >
> > > > > Fix it by taking rcu_read_lock() around the mount::mnt_ns access, like
> > > > > in __prepend_path().
> > > >
> > > > > + /*
> > > > > + * Containing namespace.
> > > > > + * Normally protected by namespace_sem, but there are also lockless
> > > > > + * readers (which must use RCU to guard against the namespace being
> > > > > + * freed).
> > > > > + */
> > > > > + struct mnt_namespace *mnt_ns;
> > > >
> > > > Umm... It's somewhat subtle - at the very least you need to explain why
> > > > there will be an RCU delay between umount_tree() clearing that and
> > > > having the sucker freed.
> > >
> > > Something along the lines of "removals from namespace are serialized on
> > > namespace_sem and guaranteed to happen no later than the active
> > > refcount on namespace reaches zero; freeing of namespace happens only
> > > after the passive refcount hitting zero and there's an RCU delay between
> > > dropping the last active ref and dropping the passive one that had been
> > > implicitly held by the fact of having actives", perhaps? Only in
> > > more readable form than that, please...
> >
> > Hm, like this?
> >
> > Containing namespace (active).
>
> Umm... That's actually "active or has _just_ dropped the last active
> reference and didn't get around to scheduling decrement of passive refcount
> yet", unfortunately.
>
> Hell knows - "active or deactivating", perhaps?

Note that "active" in such context is easy to mistake for "active reference",
which it definitely isn't - it does not contribute to active refcount.
Mounts within a namespace do not pin it - it's the other way round; they
are guaranteed to stay live until they leave the sucker. Anything that
hasn't left by the time the active refcount of namespace drops to zero
will get pushed out (and killed off unless there are other references to
any such mounts)