Re: [RFC PATCH 4/4] vfs: add filesystem freeze/thaw callbacks for power management
From: Christian Brauner
Date: Fri Mar 28 2025 - 11:52:55 EST
> Since this is a hybrid thread between power management and VFS, could I
> just summarize what I think the various superblock locks are before
> discussing the actual problem (important because the previous threads
> always gave the impression of petering out for fear of vfs locking).
>
> s_count: outermost of the superblock locks refcounting the superblock
> structure itself, making no guarantee that any of the underlying
> filesystem superblock structures are attached (i.e. kill_sb() may have
> been called). Taken by incrementing under the global sb_lock and
> decremented using a put_super() variant.
and protects the presence of the superblock on the global super lists.
>
> s_active: an atomic reference counting the underlying filesystem
> specific superblock structures. if you hold s_active, kill_sb cannot
> be called. Acquired by atomic_inc_not_zero() with a possible failure
> if it is zero and released by deactivate_super() and its variants.
or deactivate_locked_super() depending on whether s_umount is held or
not.
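
To make the s_active semantics concrete, here is a minimal userspace
sketch using C11 atomics. It is a model, not the code in fs/super.c:
struct model_sb, model_get_active() and model_put_active() are
hypothetical names, and the "killed" flag merely stands in for kill_sb()
being called. It only illustrates the increment-unless-zero acquire and
the last-put teardown described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model only; not the kernel's struct super_block. */
struct model_sb {
	atomic_int s_active;	/* active references to the fs-specific state */
	bool killed;		/* set once the (modelled) kill_sb() has run */
};

/* Take an active reference unless the count has already dropped to zero. */
static bool model_get_active(struct model_sb *sb)
{
	int old = atomic_load(&sb->s_active);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&sb->s_active, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop an active reference; the last put tears the filesystem state down. */
static void model_put_active(struct model_sb *sb)
{
	int old = atomic_fetch_sub(&sb->s_active, 1);

	assert(old > 0);
	if (old == 1)
		sb->killed = true;	/* stands in for kill_sb() */
}
```

Once the count has reached zero, model_get_active() can only fail, which
is the property that makes holding s_active sufficient to keep kill_sb()
away.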
>
> s_umount: rwsem and innermost of the superblock locks. Used to protect
No, it's not innermost. super_lock is a spinlock and obviously doesn't
nest with the semaphore. It's almost always the outermost lock for what
we're discussing here. It is even the outermost lock relative to most
block device locks.
It's also intimately tied into mount code and has implications for the
dcache and icache. That's all orthogonal to this thread.
> various operations from races. Taken exclusively with down_write and
> shared with down_read. Private functions internal to super.c wrap this
> with grab_super and super_lock_shared/excl() wrappers.
See also the Documentation/filesystems/lock I added.
>
> The explicit freeze/thaw_super() functions require the s_umount rwsem
> in down_write or exclusive mode and take it as the first step in their
> operation. Looking at the locking in fs_bdev_freeze/thaw() implies
> that the super_operations freeze_super/thaw_super *don't* need this
> taken (presumably they handle it internally).
Block device locking cannot acquire s_umount as that would cause a
lock inversion with the block device open_mutex. The locking scheme
using sb_lock and the holder mutex allows safely acquiring the
superblock. It's orthogonal to what you're doing though.
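
For readers less familiar with the term, a lock inversion of this kind
can be sketched as a tiny order checker. This is a hypothetical userspace
model, loosely in the spirit of what lockdep reports, not kernel code:
the enum names only mirror the locks mentioned in this mail, and the
assumed established order (s_umount outside open_mutex) is taken from
the "outermost" remark above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes for this sketch; names mirror the mail only. */
enum lock_class { S_UMOUNT, OPEN_MUTEX, NR_CLASSES };

/* seen[a][b] is true once some path acquired b while holding a. */
static bool seen[NR_CLASSES][NR_CLASSES];

/*
 * Record "inner acquired while outer is held". Returns false if the
 * opposite order was recorded earlier, i.e. an ABBA inversion.
 * Only direct inversions are detected, not longer dependency chains.
 */
static bool record_order(enum lock_class outer, enum lock_class inner)
{
	assert(outer != inner);
	if (seen[inner][outer])
		return false;
	seen[outer][inner] = true;
	return true;
}
```

With record_order(S_UMOUNT, OPEN_MUTEX) recorded first, a later
record_order(OPEN_MUTEX, S_UMOUNT) returns false. That second call is
the shape a block device path would have if it tried to take s_umount
while already holding open_mutex.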