Re: [PATCH] fhandle: fix UAF due to unlocked ->mnt_ns read in may_decode_fh()

From: Jann Horn

Date: Wed Jun 03 2026 - 15:10:16 EST


On Wed, Jun 3, 2026 at 9:02 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Jun 03, 2026 at 07:53:24PM +0100, Al Viro wrote:
> > On Wed, Jun 03, 2026 at 08:46:07PM +0200, Jann Horn wrote:
> > > On Wed, Jun 3, 2026 at 8:24 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > > > On Wed, Jun 03, 2026 at 07:15:23PM +0100, Al Viro wrote:
> > > > > On Wed, Jun 03, 2026 at 07:38:06PM +0200, Jann Horn wrote:
> > > > >
> > > > > > Fix it by taking rcu_read_lock() around the mount::mnt_ns access, like
> > > > > > in __prepend_path().
> > > > >
> > > > > > + /*
> > > > > > + * Containing namespace.
> > > > > > + * Normally protected by namespace_sem, but there are also lockless
> > > > > > + * readers (which must use RCU to guard against the namespace being
> > > > > > + * freed).
> > > > > > + */
> > > > > > + struct mnt_namespace *mnt_ns;
> > > > >
> > > > > Umm... It's somewhat subtle - at the very least you need to explain why
> > > > > there will be an RCU delay between umount_tree() clearing that and
> > > > > having the sucker freed.
> > > >
> > > > Something along the lines of "removals from namespace are serialized on
> > > > namespace_sem and guaranteed to happen no later than the active
> > > > refcount on namespace reaches zero; freeing of namespace happens only
> > > > after the passive refcount hitting zero and there's an RCU delay between
> > > > dropping the last active ref and dropping the passive one that had been
> > > > implicitly held by the fact of having actives", perhaps? Only in
> > > > more readable form than that, please...
> > >
> > > Hm, like this?
> > >
> > > Containing namespace (active).
> >
> > Umm... That's actually "active or has _just_ dropped the last active
> > reference and didn't get around to scheduling decrement of passive refcount
> > yet", unfortunately.

Ah, right, I see, because the mounts of non-anonymous namespaces are
only cleared in put_mnt_ns() after the active reference drop.

> > Hell knows - "active or deactivating", perhaps?
>
> Note that "active" in such context is easy to mistake for "active reference",
> which it definitely isn't - it does not contribute to active refcount.
> Mounts within a namespace do not pin it - it's the other way round; they
> are guaranteed to stay live until they leave the sucker. Anything that
> hasn't left by the time the active refcount of namespace drops to zero
> will get pushed out (and killed off unless there are other references to
> any such mounts)

(And there's also that weird detail of how, for anonymous namespaces,
the active refcount isn't used and AFAICS never actually drops to
zero...)

So I guess I'll write "Containing namespace (active or deactivating,
non-refcounted)."?