[PATCH] ext4: move inline data cleanup to ext4_writepages to fix deadlock

From: Yun Zhou

Date: Tue Jun 09 2026 - 11:55:41 EST


ext4_do_writepages() calls ext4_destroy_inline_data(), which takes
xattr_sem for write while s_writepages_rwsem is held for read. This
creates a circular lock dependency with the xattr path, which can
re-enter writeback through iput() while holding xattr_sem:

CPU0                          CPU1
----                          ----
ext4_writepages()
  ext4_writepages_down_read()
  [holds s_writepages_rwsem]
                              ext4_evict_inode()
                                __ext4_mark_inode_dirty()
                                  ext4_expand_extra_isize_ea()
                                    ext4_xattr_block_set()
                                    [holds xattr_sem]
                                      iput(old_bh inode)
                                        write_inode_now()
                                          ext4_writepages()
                                            ext4_writepages_down_read()
                                            [BLOCKED on s_writepages_rwsem]
  ext4_do_writepages()
    ext4_destroy_inline_data()
      down_write(xattr_sem)
      [BLOCKED on xattr_sem]

Move inline data destruction from ext4_do_writepages() into
ext4_writepages(), before acquiring s_writepages_rwsem.
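
For illustration only, the effect of the reordering on lock ordering can
be sketched in user space. This is not kernel code and not part of the
patch: it is a toy, single-threaded order tracker in the spirit of
lockdep that records "B was taken while A was held" edges. All names in
it (order_graph, acquire, writepages_old, writepages_new, and so on) are
hypothetical stand-ins, and the two enum values merely label the two
ext4 locks discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes standing in for the two ext4 locks. */
enum lock_class { WRITEPAGES_RWSEM, XATTR_SEM, NR_CLASSES };

struct order_graph {
	/* edge[a][b] is true if b was acquired while a was held. */
	bool edge[NR_CLASSES][NR_CLASSES];
	bool held[NR_CLASSES];
};

static void graph_init(struct order_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record an edge from every currently held class, then mark cls held. */
static void acquire(struct order_graph *g, enum lock_class cls)
{
	for (int h = 0; h < NR_CLASSES; h++)
		if (g->held[h])
			g->edge[h][cls] = true;
	g->held[cls] = true;
}

static void release(struct order_graph *g, enum lock_class cls)
{
	g->held[cls] = false;
}

/* With only two classes, a cycle exists exactly when both orders occur. */
static bool has_inversion(const struct order_graph *g)
{
	return g->edge[WRITEPAGES_RWSEM][XATTR_SEM] &&
	       g->edge[XATTR_SEM][WRITEPAGES_RWSEM];
}

/* Models the xattr path: writeback is re-entered under xattr_sem. */
static void xattr_path(struct order_graph *g)
{
	acquire(g, XATTR_SEM);
	acquire(g, WRITEPAGES_RWSEM);
	release(g, WRITEPAGES_RWSEM);
	release(g, XATTR_SEM);
}

/* Models writeback before the change: xattr_sem taken under the rwsem. */
static void writepages_old(struct order_graph *g)
{
	acquire(g, WRITEPAGES_RWSEM);
	acquire(g, XATTR_SEM);
	release(g, XATTR_SEM);
	release(g, WRITEPAGES_RWSEM);
}

/* Models writeback after the change: xattr_sem dropped before the rwsem. */
static void writepages_new(struct order_graph *g)
{
	acquire(g, XATTR_SEM);
	release(g, XATTR_SEM);
	acquire(g, WRITEPAGES_RWSEM);
	release(g, WRITEPAGES_RWSEM);
}

/* Run the xattr path plus one writeback variant; report any inversion. */
bool inversion_with(void (*writepages)(struct order_graph *))
{
	struct order_graph g;

	graph_init(&g);
	xattr_path(&g);
	writepages(&g);
	return has_inversion(&g);
}
```

In this model, inversion_with(writepages_old) reports an inversion and
inversion_with(writepages_new) does not, because the new ordering never
takes the xattr_sem stand-in while the rwsem stand-in is held. The
sketch tracks ordering only; it does not model reader/writer semantics
or reproduce the deadlock itself.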

This is safe because the other caller of ext4_do_writepages()
(ext4_normal_submit_inode_data_buffers, invoked by jbd2 during commit)
can never encounter inline data: jbd2 only tracks inodes with
block-mapped dirty ranges registered via ext4_jbd2_inode_add_write(),
and all such registration paths either explicitly bail out when inline
data is present (ext4_journalled_write_end) or are logically
unreachable for inline data inodes (ext4_map_blocks requires block
allocation, ext4_block_zero_eof requires existing blocks).

Reported-by: syzbot+bb2455d02bda0b5701e3@xxxxxxxxxxxxxxxxxxxxxxxxx
Closes: https://syzkaller.appspot.com/bug?extid=bb2455d02bda0b5701e3
Fixes: c8585c6fcaf2 ("ext4: fix races between changing inode journal mode and ext4_writepages")
Signed-off-by: Yun Zhou <yun.zhou@xxxxxxxxxxxxx>
---
fs/ext4/inode.c | 47 +++++++++++++++++++++++++++++------------------
1 file changed, 29 insertions(+), 18 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c2c2d6ac7f3d..0c7461ab4fd0 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2810,24 +2810,6 @@ static int ext4_do_writepages(struct mpage_da_data *mpd)
 	if (unlikely(ret))
 		goto out_writepages;

-	/*
-	 * If we have inline data and arrive here, it means that
-	 * we will soon create the block for the 1st page, so
-	 * we'd better clear the inline data here.
-	 */
-	if (ext4_has_inline_data(inode)) {
-		/* Just inode will be modified... */
-		handle = ext4_journal_start(inode, EXT4_HT_INODE, 1);
-		if (IS_ERR(handle)) {
-			ret = PTR_ERR(handle);
-			goto out_writepages;
-		}
-		BUG_ON(ext4_test_inode_state(inode,
-				EXT4_STATE_MAY_INLINE_DATA));
-		ext4_destroy_inline_data(handle, inode);
-		ext4_journal_stop(handle);
-	}
-
 	/*
 	 * data=journal mode does not do delalloc so we just need to writeout /
 	 * journal already mapped buffers. On the other hand we need to commit
@@ -3038,6 +3020,35 @@ static int ext4_writepages(struct address_space *mapping,
 	if (unlikely(ret))
 		return ret;

+	/*
+	 * Clearing inline data acquires xattr_sem, which ranks above
+	 * s_writepages_rwsem. Do it here before taking the rwsem to avoid
+	 * a circular dependency:
+	 *   ext4_writepages (s_writepages_rwsem) -> ext4_destroy_inline_data
+	 *   (xattr_sem)
+	 *   ext4_xattr_block_set (xattr_sem) -> iput -> ext4_writepages
+	 *   (s_writepages_rwsem)
+	 *
+	 * This is only needed in the ext4_writepages() path. The other
+	 * caller of ext4_do_writepages() -- ext4_normal_submit_inode_data_buffers
+	 * (jbd2 commit callback) -- cannot encounter inline data because jbd2
+	 * only tracks inodes with block-mapped dirty ranges registered via
+	 * ext4_jbd2_inode_add_write(), and all such callers either bail out
+	 * for inline data inodes (e.g. ext4_journalled_write_end) or are
+	 * unreachable for them (ext4_map_blocks, ext4_block_zero_eof).
+	 */
+	if (ext4_has_inline_data(mapping->host)) {
+		handle_t *handle;
+
+		handle = ext4_journal_start(mapping->host, EXT4_HT_INODE, 1);
+		if (IS_ERR(handle))
+			return PTR_ERR(handle);
+		BUG_ON(ext4_test_inode_state(mapping->host,
+					     EXT4_STATE_MAY_INLINE_DATA));
+		ext4_destroy_inline_data(handle, mapping->host);
+		ext4_journal_stop(handle);
+	}
+
 	alloc_ctx = ext4_writepages_down_read(sb);
 	ret = ext4_do_writepages(&mpd);
 	/*
--
2.43.0