Re: [GIT PULL] vfs fixes

From: Linus Torvalds
Date: Mon Mar 18 2024 - 15:14:48 EST


On Mon, 18 Mar 2024 at 05:20, Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> * Take a passive reference on the superblock when opening a block device
> so the holder is available to concurrent callers from the block layer.

So I've pulled this, but I have to admit that I hate it.

The bdev "holder" logic is an abomination. And "struct blk_holder_ops"
is horrendous.

Afaik, we have exactly two cases of "struct blk_holder_ops" in the
whole kernel, and you edited one of them.

And the other one is in bcachefs, and is a completely empty one with
no actual ops, so I think that one shouldn't exist.

In other words, we have only *one* actual set of "holder ops". That
makes me suspicious in the first place.

Now, let's then look at that new "holder->put_holder" use. It has
_one_ single user too, which is bd_end_claim(), which is called from
one place, which is bdev_release(). Which in turn is called from
exactly one place, which is blkdev_release(). Which is the release
function for def_blk_fops. Which is called from __fput() on the last
release of the file.

Fine, fine, fine. So let's chase down *who* actually uses that single
"blk_holder_ops". And it turns out that it's used in three places:
fs/super.c, fs/ext4/super.c, and fs/xfs/xfs_super.c.

So in those three cases, it would be absolutely *wrong* if the
'holder' was anything but the super-block (because that's what the new
get/put functions require for any of this to work.

This all smells horribly bad to me. The code looks and acts like it is
some generic interface, but in reality it really isn't. Yes, bcachefs
seems to make up some random holder (it's a one-byte kmalloc that
isn't actually used), and a random holder op structure (it's empty, as
mentioned), but none of this makes any sense at all.

I get the feeling that the "get/put" operations should just be done in
the three places that currently use that 'fs_holder_ops'.

IOW, isn't the 'get()' always basically paired with the mounting? And
the 'put()' would probably be best done iin kill_block_super()?

I don't know. Maybe I missed something really important, but this
smells like a specific case that simply shouldn't have gotten this
kind of "generic infrastructure" solution.

Linus