Re: [RFC PATCH 4/4] vfs: add filesystem freeze/thaw callbacks for power management
From: Christian Brauner
Date: Sat Mar 29 2025 - 04:24:10 EST
On Fri, Mar 28, 2025 at 12:15:55PM -0400, James Bottomley wrote:
> On Fri, 2025-03-28 at 16:52 +0100, Christian Brauner wrote:
> [...]
> >
> > > various operations from races. Taken exclusively with down_write
> > > and shared with down_read. Private functions internal to super.c
> > > wrap this with grab_super and super_lock_shared/excl() wrappers.
> >
> > See also the Documentation/filesystems/lock I added.
>
> you mean locking.rst which covers s_umount? It would be nice to add
> the others as well.
>
> > > The explicit freeze/thaw_super() functions require the s_umount
> > > rwsem in down_write or exclusive mode and take it as the first step
> > > in their operation. Looking at the locking in
> > > fs_bdev_freeze/thaw() implies that the super_operations
> > > freeze_super/thaw_super *don't* need this taken (presumably they
> > > handle it internally).
> >
> > Block device locking cannot acquire the s_umount as that would cause
> > lock inversion with the block device open_mutex. The locking scheme
> > using sb_lock and the holder mutex allow safely acquiring the
> > superblock. It's orthogonal to what you're doing though.
>
> OK, but based on the above and the fact that the code has to call
> either the super op freeze/thaw_super or the global call, I think this
> can be handled in the callback as something like rather than trying to
> thread an exclusive s_umount:
Eww, no. We're not going to open-code that in two different places.
> static void filesystems_thaw_callback(struct super_block *sb)
> {
> if (unlikely(!atomic_inc_not_zero(&sb->s_active)))
> return;
>
> if (sb->s_op->thaw_super)
> sb->s_op->thaw_super(sb, FREEZE_MAY_NEST
> | FREEZE_HOLDER_KERNEL
> | freeze_flags);
> else if (sb->s_bdev)
> thaw_super(sb, FREEZE_MAY_NEST | FREEZE_HOLDER_KERNEL
> | freeze_flags);
>
> deactivate_super(sb);
> }
This is broken. The freeze/thaw functions cannot be called with s_umount
held otherwise they deadlock. And not holding s_umount while taking an
active reference count isn't supported as we're optimistically dropping
reference counts. We're not introducing exceptions to that scheme for no
good reason.
The other option is to move everything into the caller and bring back
get_active_super() and then add SUPER_ITER_UNLOCKED instead of
SUPER_ITER_GRAB. That's what I've done now.