Re: [PATCH] ext4: skip extra isize expansion on inode eviction to avoid deadlock
From: Jan Kara
Date: Thu Jun 11 2026 - 10:04:38 EST
On Thu 11-06-26 20:45:55, Yun Zhou wrote:
> Expanding extra isize on an inode that is being evicted is pointless
> since the inode is about to be deleted. Skip it by setting
> EXT4_STATE_NO_EXPAND before calling ext4_mark_inode_dirty() in the
> eviction path.
>
> This also breaks a circular lock dependency reported by lockdep during
> orphan cleanup at mount time:
>
>   CPU0 (writeback worker)            CPU1 (open)
>   ----                               ----
>   ext4_writepages()
>     s_writepages_rwsem (read)        ext4_create()
>     ext4_do_writepages()               __ext4_new_inode()
>       ext4_journal_start()               [holds jbd2 handle]
>       wait_transaction_locked()          ext4_xattr_set_handle()
>       [WAIT for jbd2_handle]               xattr_sem (write)
>
>   CPU2 (mount / orphan cleanup)
>   ----
>   ext4_evict_inode()
>     __ext4_mark_inode_dirty()
>       ext4_try_to_expand_extra_isize()
>         xattr_sem (write)
>         ext4_expand_extra_isize_ea()
>           ext4_xattr_block_set()
>             iput(ea_inode)
>               write_inode_now()
>                 ext4_writepages()
>                   s_writepages_rwsem (read)
>                   [WAIT for s_writepages_rwsem -- if blocked by write lock holder]
>
> This forms a circular dependency on lock classes:
>
> s_writepages_rwsem --> jbd2_handle --> xattr_sem --> s_writepages_rwsem
>
> The iput() inside ext4_xattr_block_set() triggers write_inode_now()
> because SB_ACTIVE is not yet set during mount, so iput_final() cannot
> cache the inode in the LRU and must flush it synchronously.
>
> Setting EXT4_STATE_NO_EXPAND prevents ext4_try_to_expand_extra_isize()
> from executing, which eliminates the xattr_sem --> s_writepages_rwsem
> edge and breaks the cycle.
>
> Reported-by: syzbot+5d19358d7eb30ffb0cc5@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://syzkaller.appspot.com/bug?extid=5d19358d7eb30ffb0cc5
> Fixes: c8585c6fcaf2 ("ext4: fix races between changing inode journal mode and ext4_writepages")
> Signed-off-by: Yun Zhou <yun.zhou@xxxxxxxxxxxxx>
Thanks for the patch! I have no problem with setting EXT4_STATE_NO_EXPAND
in ext4_evict_inode(); as you correctly point out, expansion is pointless in
that case. But your patch doesn't actually fix the real problem; it only
deals with the particular syzbot reproducer. The real problem is that
ext4_xattr_block_set(), which runs inside a transaction, can end up
acquiring s_writepages_rwsem, which violates the lock ordering rules. That
is the problem that really needs to be fixed.
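
To illustrate the point, here is a toy userspace model of the lock-class
graph. It is only a sketch, not kernel code and not how lockdep is
implemented: struct lock_graph, add_dep(), has_cycle(), build_graph() and
its two flags are hypothetical names made up for this example. The edges
are the ones named in this thread: writeback takes s_writepages_rwsem and
then starts a handle, ext4_create() takes xattr_sem under a handle, the
eviction path reaches s_writepages_rwsem under xattr_sem, and (the general
problem) ext4_xattr_block_set() can reach s_writepages_rwsem while a handle
is held.

```c
#include <stdbool.h>
#include <stddef.h>

/* Lock classes named in the thread; the graph itself is a toy model. */
enum lock_class { S_WRITEPAGES_RWSEM, JBD2_HANDLE, XATTR_SEM, NR_CLASSES };

struct lock_graph {
	/* edge[a][b] is true when class b is acquired while class a is held */
	bool edge[NR_CLASSES][NR_CLASSES];
};

static void add_dep(struct lock_graph *g, enum lock_class held,
		    enum lock_class acquired)
{
	g->edge[held][acquired] = true;
}

/* Depth-first search; color: 0 = unvisited, 1 = on the stack, 2 = finished. */
static bool dfs(const struct lock_graph *g, int node, int *color)
{
	color[node] = 1;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (!g->edge[node][next])
			continue;
		if (color[next] == 1)
			return true;	/* back edge: circular dependency */
		if (color[next] == 0 && dfs(g, next, color))
			return true;
	}
	color[node] = 2;
	return false;
}

static bool has_cycle(const struct lock_graph *g)
{
	int color[NR_CLASSES] = { 0 };

	for (int n = 0; n < NR_CLASSES; n++)
		if (color[n] == 0 && dfs(g, n, color))
			return true;
	return false;
}

/*
 * Hypothetical helper. Both flags are assumptions of this sketch:
 *   evict_expands       - eviction still runs extra isize expansion
 *   block_set_in_handle - ext4_xattr_block_set() can reach
 *                         s_writepages_rwsem while a handle is held
 */
static struct lock_graph build_graph(bool evict_expands,
				     bool block_set_in_handle)
{
	struct lock_graph g = { 0 };

	add_dep(&g, S_WRITEPAGES_RWSEM, JBD2_HANDLE);	/* writeback worker */
	add_dep(&g, JBD2_HANDLE, XATTR_SEM);		/* ext4_create() */
	if (evict_expands)
		add_dep(&g, XATTR_SEM, S_WRITEPAGES_RWSEM);
	if (block_set_in_handle)
		add_dep(&g, JBD2_HANDLE, S_WRITEPAGES_RWSEM);
	return g;
}
```

In this model, build_graph(true, false) contains the three-lock cycle from
the report, and build_graph(false, true) still contains a cycle between
s_writepages_rwsem and jbd2_handle even with expansion skipped on eviction.
Only build_graph(false, false) is acyclic.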
Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR