[PATCH v9 0/4] ext4: deferred iput framework for EA inodes

From: Yun Zhou

Date: Tue Jun 23 2026 - 04:54:52 EST


This series introduces a deferred-iput framework for EA inodes to
eliminate a class of lock ordering issues in ext4 xattr code.

The problem: iput() on EA inodes while holding xattr_sem or a jbd2
handle can trigger eviction, which may acquire those same locks or
s_writepages_rwsem, creating circular dependencies. The immediate
deadlock (during mount-time orphan cleanup) is fixed by two separate
patches already reviewed and posted:

ext4: skip extra isize expansion during mount to prevent deadlock
ext4: set EXT4_STATE_NO_EXPAND in ext4_evict_inode

This series provides the structural fix that makes the code safe
regardless of calling context:

Patch 1 adds a VFS helper iput_if_not_last() which drops an inode
reference only if it is not the last one, using atomic_add_unless().
This provides a proper VFS abstraction for filesystems that need to
conditionally defer final iput.

Patch 2 introduces ext4_put_ea_inode() using iput_if_not_last() as
a fast path (single atomic, zero overhead for the common case). If
this is the last reference, the inode is linked onto a per-sb llist
(via i_ea_iput_node embedded in ext4_inode_info, union with xattr_sem
which is unused for EA inodes) and a delayed worker (1 jiffie) performs
the final iput() in a clean context. No per-iput allocation needed.
Also moves init_rwsem(xattr_sem) from init_once to ext4_alloc_inode
to handle slab reuse after the union field has been overwritten.

Patch 3 converts all EA inode iput() calls in xattr code to use
ext4_put_ea_inode() uniformly -- no exceptions to reason about.

Patch 4 removes the now-redundant ea_inode_array mechanism (parameter
threading, struct, expand/free functions), replaced entirely by direct
ext4_put_ea_inode() calls. This is a net code reduction.

Link: https://syzkaller.appspot.com/bug?extid=5d19358d7eb30ffb0cc5

v9:
- Add iput_if_not_last() as proper VFS helper (per reviewer: don't
let filesystems manipulate inode refcount without VFS abstraction).
- Use iput_if_not_last() + llist_node embedded in ext4_inode_info
(union with xattr_sem) to avoid per-iput allocation entirely.
- Convert ALL EA inode iput() calls uniformly -- no exceptions.
- Remove entire ea_inode_array mechanism.
- Add WARN_ON_ONCE in ext4_put_ea_inode() to catch misuse on non-EA
inodes (protects the xattr_sem union safety).
- Fix worker re-arm: ext4_drain_ea_inode_work() loops to handle
nested EA inode evictions re-scheduling work.
- Move INIT_DELAYED_WORK before journal loading (fast commit replay
may trigger evictions).
- Drain before ext4_quotas_off() for correct quota accounting.
- Add flush in failed_mount_wq and failed_mount3a error paths for
journal replay case.
- Move init_rwsem(xattr_sem) from init_once to ext4_alloc_inode to
handle slab object reuse after union overwrite.
- Encapsulate worker init into ext4_init_ea_inode_work(), making
ext4_ea_inode_work() static to xattr.c.

(Resending with VFS maintainers Cc'd -- no code changes from initial posting)

Yun Zhou (4):
fs: add iput_if_not_last() helper
ext4: introduce ext4_put_ea_inode() for safe deferred iput
ext4: convert all EA inode iput() calls to ext4_put_ea_inode()
ext4: remove ea_inode_array mechanism in favor of ext4_put_ea_inode()

fs/ext4/ext4.h | 13 ++++-
fs/ext4/inode.c | 6 +-
fs/ext4/super.c | 18 +++++-
fs/ext4/xattr.c | 150 ++++++++++++++++++++-------------------------
fs/ext4/xattr.h | 21 ++++---
include/linux/fs.h | 13 ++++
6 files changed, 125 insertions(+), 96 deletions(-)

--
2.43.0