Re: [syzbot] WARNING in mntput_no_expire (2)

From: Christian Brauner
Date: Tue Apr 06 2021 - 08:35:21 EST


On Tue, Apr 06, 2021 at 01:38:39AM +0000, Al Viro wrote:
> On Mon, Apr 05, 2021 at 10:07:37PM +0200, Christian Brauner wrote:
>
> > > diff --git a/fs/namei.c b/fs/namei.c
> > > index 216f16e74351..82344f1139ff 100644
> > > --- a/fs/namei.c
> > > +++ b/fs/namei.c
> > > @@ -2289,6 +2289,9 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
> > > int error;
> > > const char *s = nd->name->name;
> > >
> > > + nd->path.mnt = NULL;
> > > + nd->path.dentry = NULL;
> > > +
> > > /* LOOKUP_CACHED requires RCU, ask caller to retry */
> > > if ((flags & (LOOKUP_RCU | LOOKUP_CACHED)) == LOOKUP_CACHED)
> > > return ERR_PTR(-EAGAIN);
> > > @@ -2322,8 +2325,6 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
> > > }
> > >
> > > nd->root.mnt = NULL;
> > > - nd->path.mnt = NULL;
> > > - nd->path.dentry = NULL;
> > >
> > > /* Absolute pathname -- fetch the root (LOOKUP_IN_ROOT uses nd->dfd). */
> > > if (*s == '/' && !(flags & LOOKUP_IN_ROOT)) {
> >
> > Bingo. That fixes it.
>
> *grumble*
>
> OK, I suppose it'll do for backports, but longer term... I don't like how
> convoluted the rules for nameidata fields' validity are. In particular,
> for nd->path I would rather have it
> * cleared in set_nameidata()
> * cleared when it become invalid. That would be
> * places that drop rcu_read_lock() without having legitimized the sucker
> (already done, except for terminate_walk())
> * terminate_walk() in non-RCU case after path_put(&nd->path)
>
> OTOH... wait a sec - the whole thing is this cycle regression, so...
>
> Could you verify that the variant below fixes that crap?
>
> Make sure nd->path.mnt and nd->path.dentry are always valid pointers
>
> Initialize them in set_nameidata() and make sure that terminate_walk() clears them
> once the pointers become potentially invalid (i.e. we leave RCU mode or drop them
> in non-RCU one). Currently we have "path_init() always initializes them and nobody
> accesses them outside of path_init()/terminate_walk() segments", which is asking
> for trouble.
>
> With that change we would have nd->path.{mnt,dentry}
> 1) always valid - NULL or pointing to currently allocated objects.
> 2) non-NULL while we are successfully walking
> 3) NULL when we are not walking at all
> 4) contributing to refcounts whenever non-NULL outside of RCU mode.
>
> Hopefully-fixes: 6c6ec2b0a3e0 ("fs: add support for LOOKUP_CACHED")
> Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> ---
> diff --git a/fs/namei.c b/fs/namei.c
> index 216f16e74351..fc8760d4314e 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -579,6 +579,8 @@ static void set_nameidata(struct nameidata *p, int dfd, struct filename *name)
> p->stack = p->internal;
> p->dfd = dfd;
> p->name = name;
> + p->path.mnt = NULL;
> + p->path.dentry = NULL;
> p->total_link_count = old ? old->total_link_count : 0;
> p->saved = old;
> current->nameidata = p;
> @@ -652,6 +654,8 @@ static void terminate_walk(struct nameidata *nd)
> rcu_read_unlock();
> }
> nd->depth = 0;
> + nd->path.mnt = NULL;
> + nd->path.dentry = NULL;
> }
>
> /* path_put is needed afterwards regardless of success or failure */
> @@ -2322,8 +2326,6 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
> }
>
> nd->root.mnt = NULL;
> - nd->path.mnt = NULL;
> - nd->path.dentry = NULL;
>
> /* Absolute pathname -- fetch the root (LOOKUP_IN_ROOT uses nd->dfd). */
> if (*s == '/' && !(flags & LOOKUP_IN_ROOT)) {

Yeah, that works too.
Though I'm not a fan of this open-coding as it looks pretty brittle
especially if we ever introduce another LOOKUP_* flag that requires us
to clear some other field carefully. I think a tiny static inline void
helper might make it easier to grok what's going on.

And while we're at it might I bring up the possibility of an additional
cleanup of how we currently call path_init().
Right now we pass the return value from path_init() directly into e.g.
link_path_walk() which as a first thing checks for error. Which feels
rather wrong and has always confused me when looking at these codepaths.
I get that it might make sense for reasons unrelated to path_init() that
link_path_walk() checks its first argument for error but path_init()
should be checked for error right away especially now that we return
early when LOOKUP_CACHED is set without LOOKUP_RCU. I have two/three
additional patches on top of the one below that massage path_lookupat(),
path_openat() to get rid of such - imho - hard to parse while
constructs:

while (!(error = link_path_walk(s, nd)) &&
(s = <foo>(nd, file, op)) != NULL)
;

in favor of a for (;;) like we do in link_path_walk() itself already
with early exits to a cleanup label. I'm not too fond of the

if (!error)
error = foo();
if (!error)
error = bar();

terminate_walk();
return error;

thing especially in longer functions such as path_lookupat() where it
gets convoluted pretty quickly. I think it would be cleaner to have
something like [1]. The early exists make the code easier to reason
about imho. But I get that that's a style discussion. If you want to
see the whole thing and not just the patch below I'm happy to send it
though. Here's the tweaked fix for the immediate syzbot issue only: