[PATCH v2] ext4: drop s_writepages_rwsem around ext4_destroy_inline_data

From: Yun Zhou

Date: Wed Jun 10 2026 - 01:10:30 EST


ext4_do_writepages() calls ext4_destroy_inline_data(), which acquires
xattr_sem while s_writepages_rwsem is held for read. This creates a
circular lock dependency:

CPU0                          CPU1
----                          ----
ext4_writepages()
ext4_writepages_down_read()
  [holds s_writepages_rwsem]
                              ext4_evict_inode()
                              __ext4_mark_inode_dirty()
                              ext4_expand_extra_isize_ea()
                              ext4_xattr_block_set()
                                [holds xattr_sem]
                              iput(old_bh inode)
                              write_inode_now()
                              ext4_writepages()
                              ext4_writepages_down_read()
                                [BLOCKED on s_writepages_rwsem]
ext4_do_writepages()
ext4_destroy_inline_data()
down_write(xattr_sem)
  [BLOCKED on xattr_sem]

Fix by temporarily dropping s_writepages_rwsem around the call to
ext4_destroy_inline_data(). This is safe because:

- This code runs before any block mapping or IO submission, so no
  writepages state depends on the rwsem being held at this point.

- Inline data destruction is a one-way format transition (once cleared,
  EXT4_INODE_INLINE_DATA is never set again). The rwsem is re-acquired
  immediately after, ensuring format stability for the remainder of
  writepages.

- The can_map flag naturally identifies the ext4_writepages() path
  (holds rwsem) vs ext4_normal_submit_inode_data_buffers() (does not),
  so the drop/reacquire is skipped when the rwsem is not held.

Also check the return value of ext4_destroy_inline_data(). It was
previously ignored, so a failure would leave inline data intact while
writepages proceeded as if the inode were already block-mapped.

Reported-by: syzbot+bb2455d02bda0b5701e3@xxxxxxxxxxxxxxxxxxxxxxxxx
Closes: https://syzkaller.appspot.com/bug?extid=bb2455d02bda0b5701e3
Fixes: c8585c6fcaf2 ("ext4: fix races between changing inode journal mode and ext4_writepages")
Signed-off-by: Yun Zhou <yun.zhou@xxxxxxxxxxxxx>
---
v2:
- Instead of moving inline data handling to ext4_writepages(),
temporarily drop s_writepages_rwsem around ext4_destroy_inline_data()
in ext4_do_writepages(). The move approach had a race where concurrent
writes could create dirty pages with inline data after the early check,
and unconditional destruction without dirty pages would lose data.

fs/ext4/inode.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c2c2d6ac7f3d..7ec16adf4685 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1694,6 +1694,9 @@ struct mpage_da_data {
struct writeback_control *wbc;
unsigned int can_map:1; /* Can writepages call map blocks? */

+ /* Saved memalloc context from ext4_writepages_down_read() */
+ int alloc_ctx;
+
/* These are internal state of ext4_do_writepages() */
loff_t start_pos; /* The start pos to write */
loff_t next_pos; /* Current pos to examine */
@@ -2824,8 +2827,21 @@ static int ext4_do_writepages(struct mpage_da_data *mpd)
}
BUG_ON(ext4_test_inode_state(inode,
EXT4_STATE_MAY_INLINE_DATA));
- ext4_destroy_inline_data(handle, inode);
+ /*
+ * Temporarily drop s_writepages_rwsem because
+ * ext4_destroy_inline_data() acquires xattr_sem, which has
+ * a higher lock ordering rank. Holding both would create a
+ * circular dependency with ext4_xattr_block_set() -> iput()
+ * -> ext4_writepages() -> s_writepages_rwsem.
+ */
+ if (mpd->can_map)
+ ext4_writepages_up_read(inode->i_sb, mpd->alloc_ctx);
+ ret = ext4_destroy_inline_data(handle, inode);
+ if (mpd->can_map)
+ mpd->alloc_ctx = ext4_writepages_down_read(inode->i_sb);
ext4_journal_stop(handle);
+ if (ret)
+ goto out_writepages;
}

/*
@@ -3032,13 +3048,12 @@ static int ext4_writepages(struct address_space *mapping,
.can_map = 1,
};
int ret;
- int alloc_ctx;

ret = ext4_emergency_state(sb);
if (unlikely(ret))
return ret;

- alloc_ctx = ext4_writepages_down_read(sb);
+ mpd.alloc_ctx = ext4_writepages_down_read(sb);
ret = ext4_do_writepages(&mpd);
/*
* For data=journal writeback we could have come across pages marked
@@ -3047,7 +3062,7 @@ static int ext4_writepages(struct address_space *mapping,
*/
if (!ret && mpd.journalled_more_data)
ret = ext4_do_writepages(&mpd);
- ext4_writepages_up_read(sb, alloc_ctx);
+ ext4_writepages_up_read(sb, mpd.alloc_ctx);

return ret;
}
--
2.43.0