[GIT PULL 10/16 for v7.2] vfs writeback
From: Christian Brauner
Date: Fri Jun 12 2026 - 11:14:42 EST
Hey Linus,
/* Summary */
This contains the writeback changes for this cycle:
* Fix a race between cgroup_writeback_umount() and inode_switch_wbs()
When a container exits, a race between cgroup_writeback_umount() and
inode_switch_wbs()/cleanup_offline_cgwb() can trigger "VFS: Busy
inodes after unmount" followed by a use-after-free on percpu
counters. There is a window between inode_prepare_wbs_switch()
returning true (having passed the SB_ACTIVE check and grabbed the
inode) and the subsequent wb_queue_isw() call: if
cgroup_writeback_umount() observes the global isw_nr_in_flight
counter as non-zero but flush_workqueue() finds nothing queued yet,
it returns early. The inode reference that is still held then blocks
evict_inodes(), and the later iput() hits percpu counters that have
already been freed.
The race is closed by covering the window from
inode_prepare_wbs_switch() through wb_queue_isw() with an RCU
read-side critical section and synchronizing in the umount path. On
top of that the now-dead rcu_barrier() left over from the
queue_rcu_work() era is removed, and the global
synchronize_rcu()/flush_workqueue() pair is replaced with a per-sb
in-flight counter plus pin/unpin/drain helpers so umount no longer
serializes against switch activity on unrelated superblocks.
Under cgroup writeback churn on a 16 vCPU guest this takes umount
latency from ~92-138ms p50 down to ~5-8ms p50 and the cumulative
cost of cgroup_writeback_umount() from ~62ms to ~4us per call. The
initial race fix is kept separate and minimal so it backports
cleanly to stable trees that still queue switches via
queue_rcu_work().
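For readers who want a feel for the per-sb pin/unpin/drain scheme
described above, here is a minimal userspace model in C11. It is not
the code from the series: struct demo_sb and the demo_sb_*() helpers
are hypothetical names made up for illustration, the atomics stand in
for whatever primitives the kernel patches actually use, and the
drain side spins where real code would presumably sleep.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a superblock; NOT the kernel's struct super_block. */
struct demo_sb {
	atomic_bool active;        /* models the SB_ACTIVE check */
	atomic_int  isw_in_flight; /* models a per-sb count of in-flight switches */
};

void demo_sb_init(struct demo_sb *sb)
{
	atomic_init(&sb->active, true);
	atomic_init(&sb->isw_in_flight, 0);
}

/* Drop a pin taken by demo_sb_pin(). */
void demo_sb_unpin(struct demo_sb *sb)
{
	atomic_fetch_sub(&sb->isw_in_flight, 1);
}

/*
 * Announce the switch first, then check whether the sb is still active.
 * Together with the store-then-load order in demo_sb_drain() (all
 * accesses are sequentially consistent), either this side sees the sb
 * as inactive and backs out, or the drain side sees a non-zero counter
 * and waits. There is no window in which both miss each other.
 */
bool demo_sb_pin(struct demo_sb *sb)
{
	atomic_fetch_add(&sb->isw_in_flight, 1);
	if (!atomic_load(&sb->active)) {
		demo_sb_unpin(sb);
		return false;
	}
	return true;
}

/*
 * Umount side: stop new pins, then wait for this sb's counter only.
 * Switch activity on other demo_sb instances is never waited for.
 */
void demo_sb_drain(struct demo_sb *sb)
{
	atomic_store(&sb->active, false);
	while (atomic_load(&sb->isw_in_flight) != 0)
		sched_yield(); /* a real implementation would sleep, not spin */
}
```

In this model a switch path would call demo_sb_pin() before grabbing
the inode and demo_sb_unpin() once the queued work has finished, while
the umount path calls demo_sb_drain() before evicting inodes.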
* Improve write performance with RWF_DONTCACHE
Dirty DONTCACHE pages are now tracked per bdi_writeback so that the
writeback flusher can be kicked in a targeted fashion for
IOCB_DONTCACHE writes instead of relying on global writeback, and
the PG_dropbehind flag is preserved when a folio is split.
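As a reminder of what the userspace side of this looks like, below is
a small sketch of a buffered write that passes RWF_DONTCACHE to
pwritev2(). The helper name write_dontcache() is made up for this
example, and the fallback to a plain write on EOPNOTSUPP or EINVAL is
an assumption about how an application might want to handle kernels
or filesystems without support, not something the series mandates.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for pwritev2() */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_DONTCACHE
/* Value from the uapi header; older libc headers may not define it. */
#define RWF_DONTCACHE 0x00000080
#endif

/*
 * Hypothetical helper: write len bytes at offset off, asking the kernel
 * to drop the pages from the page cache once writeback has completed.
 * If the flag is rejected, retry as an ordinary buffered write.
 */
ssize_t write_dontcache(int fd, const void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	ssize_t n = pwritev2(fd, &iov, 1, off, RWF_DONTCACHE);

	if (n < 0 && (errno == EOPNOTSUPP || errno == EINVAL))
		n = pwritev2(fd, &iov, 1, off, 0);
	return n;
}
```

The data written is the same either way; the flag only changes how
long the pages stay in the page cache after writeback.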
/* Testing */
gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)
No build failures or warnings were observed.
/* Conflicts */
Merge conflicts with mainline
=============================
No known conflicts.
Merge conflicts with other trees
================================
The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:
Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)
are available in the Git repository at:
git@xxxxxxxxxxxxxxxxxxx:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.writeback
for you to fetch changes up to 0275dc184aa007b260374af6d46fb15741c062a8:
Merge patch series "mm: improve write performance with RWF_DONTCACHE" (2026-06-04 10:18:25 +0200)
----------------------------------------------------------------
vfs-7.2-rc1.writeback
Please consider pulling these changes from the signed vfs-7.2-rc1.writeback tag.
Thanks!
Christian
----------------------------------------------------------------
Baokun Li (3):
      writeback: fix race between cgroup_writeback_umount() and inode_switch_wbs()
      writeback: drop now-unnecessary rcu_barrier() in cgroup_writeback_umount()
      writeback: use a per-sb counter to drain inode wb switches at umount

Christian Brauner (2):
      Merge patch series "writeback: fix race between cgroup_writeback_umount() and inode_switch_wbs()"
      Merge patch series "mm: improve write performance with RWF_DONTCACHE"

Jeff Layton (3):
      mm: preserve PG_dropbehind flag during folio split
      mm: track DONTCACHE dirty pages per bdi_writeback
      mm: kick writeback flusher for IOCB_DONTCACHE with targeted dirty tracking
 fs/fs-writeback.c                | 138 +++++++++++++++++++++++++++++++--------
 include/linux/backing-dev-defs.h |   3 +
 include/linux/fs.h               |   6 +-
 include/linux/fs/super_types.h   |   8 +++
 include/trace/events/writeback.h |   3 +-
 mm/filemap.c                     |  15 ++++-
 mm/huge_memory.c                 |   1 +
 mm/page-writeback.c              |   6 ++
 8 files changed, 147 insertions(+), 33 deletions(-)