Re: [PATCH RFC v2 08/18] fs: add dedicated block device open helpers for filesystems
From: Jan Kara
Date: Mon Jun 22 2026 - 12:30:49 EST
On Tue 16-06-26 16:08:24, Christian Brauner wrote:
> Add fs_bdev_file_open_by_{dev,path}() and fs_bdev_file_release(). They
> open the device with fs_holder_ops and register a claim in the
> device-to-superblock table. Claims on the same (device, superblock)
> pair share one entry, so when a filesystem claims a device it already
> uses (xfs with its log on the data device), no second entry is added
> and each superblock will be acted on once.
>
> The holder argument remains purely the block layer's exclusivity token:
> a superblock, or a file_system_type for a device shared by several
> superblocks of that type. The shared case only becomes usable once the
> fs_holder_ops callbacks resolve superblocks through the table instead
> of bdev->bd_holder.
>
> Convert the main device, setup_bdev_super() and kill_block_super(),
> over: the open finds the entry registered by sget_fc() and claims it
> again. cramfs and romfs bypass kill_block_super() so they can handle
> MTD mounts and release the main device with a plain bdev_fput(), which
> would leave the claim behind: the (dev, sb) entry would never be
> unregistered and the passive reference it holds would keep the
> superblock alive forever. Convert their release paths in the same
> step.
>
> The frozen-device check stays in setup_bdev_super() for the primary
> device and is added to fs_bdev_register() for new claims, i.e. every
> additional device a filesystem opens through the helpers. Only a
> (device, superblock) pair the superblock claimed earlier may be
> reopened while frozen (xfs with its log on the data device): the freeze
> already covers that superblock through the existing claim, so nothing
> escapes it. Without the setup_bdev_super() check a device frozen before
> the mount even started (dm lock_fs, loop) could be mounted and written
> to (journal replay) under an active freeze, because the primary open
> reuses the entry registered by sget_fc() and never takes the new-claim
> path.
>
> Both checks read bd_fsfreeze_count only after the entry is published
> (by sget_fc() for the primary, by fs_bdev_register() for new claims)
> and pair with bdev_freeze() incrementing the count before walking the
> table: either the mount sees the elevated freeze count and fails with
> EBUSY, or the freeze finds the published entry and converges once
> SB_BORN is set.
>
> Signed-off-by: Christian Brauner (Amutable) <brauner@xxxxxxxxxx>
...
> +static int fs_bdev_register(struct file *bdev_file, struct super_block *sb)
> +{
> + struct super_dev *sb_dev __free(kfree) = NULL;
Frankly I find the use of __free on sb_dev more confusing than helping in
this function. If you didn't use it, you could remove the somewhat
confusing retain_and_null_ptr() calls below, remove this initialization and
just put one kfree() into the error handling branch when super_dev_insert()
fails...
> + dev_t dev = file_bdev(bdev_file)->bd_dev;
> + int err;
> +
> + scoped_guard(rcu) {
> + sb_dev = super_dev_lookup(dev, sb);
> + if (sb_dev && refcount_inc_not_zero(&sb_dev->sd_ref)) {
> + retain_and_null_ptr(sb_dev);
> + return 0;
> + }
> + }
> +
> + sb_dev = super_dev_alloc(dev, sb);
> + if (!sb_dev)
> + return -ENOMEM;
> +
> + err = super_dev_insert(sb_dev);
> + if (err)
> + return err;
> +
> + /* Publish the entry before reading the count; pairs with bdev_freeze(). */
> + smp_mb();
> + if (atomic_read(&file_bdev(bdev_file)->bd_fsfreeze_count) > 0) {
> + err = -EBUSY;
> + super_dev_put(sb_dev);
> + }
> +
> + retain_and_null_ptr(sb_dev);
> + return err;
> +}
...
> +/**
> + * fs_bdev_file_release - release a block device claimed for a superblock
> + * @bdev_file: file returned by fs_bdev_file_open_by_{dev,path}()
> + * @sb: superblock the device was claimed for
> + *
> + * Drop one claim on the {dev, @sb} entry; the last claim unregisters it (a
> + * pinning cursor defers the actual unlink). Then close the block device.
> + */
> +void fs_bdev_file_release(struct file *bdev_file, struct super_block *sb)
> +{
> + dev_t dev = file_bdev(bdev_file)->bd_dev;
> + struct super_dev *sb_dev;
> +
> + rcu_read_lock();
> + sb_dev = super_dev_lookup(dev, sb);
> + rcu_read_unlock();
> + super_dev_put(sb_dev);
> + bdev_fput(bdev_file);
> +}
> +EXPORT_SYMBOL_GPL(fs_bdev_file_release);
Why don't you use sb->s_super_dev in this function?
Otherwise the patch looks good to me.
Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR