Re: [PATCH 05/19] fs: assign an anon dev_t in common code

From: Al Viro
Date: Wed Sep 13 2023 - 20:34:50 EST


On Wed, Sep 13, 2023 at 08:09:59AM -0300, Christoph Hellwig wrote:

> diff --git a/fs/super.c b/fs/super.c
> index bbe55f0651cca4..5c685b4944c2d6 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -787,7 +787,7 @@ struct super_block *sget_fc(struct fs_context *fc,
> struct super_block *s = NULL;
> struct super_block *old;
> struct user_namespace *user_ns = fc->global ? &init_user_ns : fc->user_ns;
> - int err;
> + int err = 0;
>
> retry:
> spin_lock(&sb_lock);
> @@ -806,14 +806,26 @@ struct super_block *sget_fc(struct fs_context *fc,
> }
>
> s->s_fs_info = fc->s_fs_info;
> - err = set(s, fc);
> - if (err) {
> - s->s_fs_info = NULL;
> - spin_unlock(&sb_lock);
> - destroy_unused_super(s);
> - return ERR_PTR(err);
> + if (set) {
> + err = set(s, fc);
> + if (err) {
> + s->s_fs_info = NULL;

Pointless (as the original had been); destroy_unused_super() doesn't
even look at ->s_fs_info.

> + goto unlock_and_destroy;
> + }
> }
> fc->s_fs_info = NULL;

Here we are transferring the ownership from fc to superblock; it used to sit
right next to insertion into lists and all failure exits past that point must
go through deactivate_locked_super(), so ->kill_sb() would be called on those
and it would take care of s->s_fs_info. However, your variant has that
ownership transfer done at the point before get_anon_bdev(), and that got
you a new failure exit where you are still calling destroy_unused_super():

> + if (!s->s_dev) {
> + /*
> + * If the file system didn't set a s_dev (which is usually only
> + * done for block based file systems), set an anonymous dev_t
> + * here now so that we always have a valid ->s_dev.
> + */
> + err = get_anon_bdev(&s->s_dev);
> + if (err)
> + goto unlock_and_destroy;

This. And that's a leak - fc has no reference left in it, and your
unlock_and_destroy won't even look at what's in ->s_fs_info, let alone know
what to do with it.

IOW, clearing fc->s_fs_info should've been done after that chunk.

And looking at the change in sget(),

> + if (set) {
> + err = set(s, data);
> + if (err)
> + goto unlock_and_destroy;
> }
> +
> + if (!s->s_dev) {
> + err = get_anon_bdev(&s->s_dev);
> + if (err)
> + goto unlock_and_destroy;
> + }

I'd rather expressed it (both there and in sget_fc() as well) as
if (set)
err = set(s, data);
if (!err && !s->s_dev)
err = get_anon_bdev(&s->s_dev);
if (err)
goto unlock_and_destroy;

That's really what your transformation does - you are lifting the
calls of get_anon_bdev() (in guise of set_anon_super()) from the
tails of 'set' callbacks into the caller, making them conditional
upon the lack of other errors from 'set' and upon the ->s_dev left
zero and allow NULL for the case when that was all that had been
there.

The only place where you do something different is this:

> @@ -1191,7 +1191,6 @@ static struct dentry *ceph_real_mount(struct ceph_fs_client *fsc,
> static int ceph_set_super(struct super_block *s, struct fs_context *fc)
> {
> struct ceph_fs_client *fsc = s->s_fs_info;
> - int ret;
>
> dout("set_super %p\n", s);
>
> @@ -1211,11 +1210,7 @@ static int ceph_set_super(struct super_block *s, struct fs_context *fc)
> s->s_flags |= SB_NODIRATIME | SB_NOATIME;
>
> ceph_fscrypt_set_ops(s);
> -
> - ret = set_anon_super_fc(s, fc);
> - if (ret != 0)
> - fsc->sb = NULL;
> - return ret;
> + return 0;

fsc->sb = NULL has disappeared here; it *is* OK (the caller won't look at
fsc->sb after failed sget_fc()), but that's worth a mention somewhere.
A separate commit removing that clearing fsc->sb in ceph_set_super(),
perhaps?