Re: [PATCH v5] ext4: fix ABBA deadlock in ext4_xattr_inode_cache_find()
From: Jan Kara
Date: Thu Jun 25 2026 - 06:24:01 EST
On Thu 25-06-26 06:50:09, Aditya Srivastava wrote:
> From: Aditya Prakash Srivastava <aditya.ansh182@xxxxxxxxx>
>
> Syzbot/stress-ng reported an ABBA deadlock in ext4 when exercising
> concurrent xattr workloads (using the ea_inode mount/format option).
>
> The deadlock occurs between the running transaction and the eviction
> thread:
> - Task 1 (stress-ng): Holds a reference to a shared mbcache_entry (ce)
> and calls ext4_xattr_inode_cache_find() -> ext4_iget() to retrieve
> the corresponding EA inode. Since the EA inode is currently being
> evicted, ext4_iget() blocks in __wait_on_freeing_inode() waiting for
> eviction to complete.
> - Task 2 (eviction thread): Currently evicting the same EA inode in
> ext4_evict_ea_inode(). It calls mb_cache_entry_wait_unused(oe) which
> blocks waiting for Task 1 to release the reference to the mbcache_entry.
>
> To break this deadlock, implement a new ext4_iget() configuration flag
> named EXT4_IGET_NOWAIT. When set, perform a non-blocking lookup of the
> inode via VFS's find_inode_nowait() API.
>
> If the inode is currently being evicted (marked with I_FREEING or
> I_WILL_FREE) or created (I_CREATING), or if it is not present in the VFS
> inode cache (cache miss), simply skip it (returning -ESTALE) rather than
> waiting for eviction/creation to complete, breaking the ABBA cycle.
>
> Since we return -ESTALE immediately on a cache miss, we never attempt to
> allocate a new inode or call iget_locked(), completely eliminating any
> TOCTOU race window.
>
> If the returned inode is I_NEW, wait for its initialization to clear via
> wait_on_new_inode(). If initialization fails and the inode is unhashed
> during wait_on_new_inode() waking up (e.g., due to an I/O read error in
> another thread), safely drop the reference and return -ESTALE. This
> unhashed check is executed unconditionally on all cache-hit pathways to
> properly handle concurrent initialization failures.
>
> Finally, standard validation checks (including is_bad_inode,
> EXT4_EA_INODE_FL, file_acl, and xattr flags) are executed as normal inside
> check_igot_inode() to fully guarantee VFS-layer safety.
>
> In ext4_xattr_inode_cache_find(), invoke ext4_iget() with the new
> EXT4_IGET_NOWAIT flag to perform the non-blocking cache search.
>
> Suggested-by: Jan Kara <jack@xxxxxxx>
> Reported-by: Colin Ian King <colin.i.king@xxxxxxxxx>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219283
> Fixes: 0a46ef234756 ("ext4: do not create EA inode under buffer lock")
> Signed-off-by: Aditya Prakash Srivastava <aditya.ansh182@xxxxxxxxx>
Three patch submissions in a day? I think you are overdoing it. Please test
more, think more about what you do (i.e., really understand what the code
does, not just whatever some AI agent suggested) and submit less.
> +static int ext4_iget_match(struct inode *inode, u64 ino, void *data)
> +{
> + bool *is_freeing = data;
> +
> + if (inode->i_ino != ino)
> + return 0;
> + spin_lock(&inode->i_lock);
> + if (inode_state_read(inode) & (I_FREEING | I_WILL_FREE | I_CREATING)) {
> + if (is_freeing)
> + *is_freeing = true;
> + spin_unlock(&inode->i_lock);
> + return -1;
> + }
> + __iget(inode);
> + spin_unlock(&inode->i_lock);
> + return 1;
> +}
> +
> struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
> ext4_iget_flags flags, const char *function,
> unsigned int line)
> @@ -5298,9 +5316,26 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
> return ERR_PTR(-EFSCORRUPTED);
> }
>
> - inode = iget_locked(sb, ino);
> - if (!inode)
> - return ERR_PTR(-ENOMEM);
> + if (flags & EXT4_IGET_NOWAIT) {
> + bool is_freeing = false;
> +
> + inode = find_inode_nowait(sb, ino, ext4_iget_match, &is_freeing);
> + if (is_freeing || !inode)
is_freeing is pointless. Just drop it and use !inode.
> + return ERR_PTR(-ESTALE);
I'd prefer -ENOENT here instead of -ESTALE. It's kind of more specific.
> + if (inode_state_read_once(inode) & I_NEW)
> + wait_on_new_inode(inode);
> +
> + if (unlikely(inode_unhashed(inode))) {
> + iput(inode);
> + return ERR_PTR(-ESTALE);
Here -ENOENT as well please.
Otherwise the patch looks good to me.
Honza
> + }
> + } else {
> + inode = iget_locked(sb, ino);
> + if (!inode)
> + return ERR_PTR(-ENOMEM);
> + }
> +
> if (!(inode_state_read_once(inode) & I_NEW)) {
> ret = check_igot_inode(inode, flags, function, line);
> if (ret) {
> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> index 982a1f831e22..21b5670d8503 100644
> --- a/fs/ext4/xattr.c
> +++ b/fs/ext4/xattr.c
> @@ -1550,7 +1550,7 @@ ext4_xattr_inode_cache_find(struct inode *inode, const void *value,
>
> while (ce) {
> ea_inode = ext4_iget(inode->i_sb, ce->e_value,
> - EXT4_IGET_EA_INODE);
> + EXT4_IGET_EA_INODE | EXT4_IGET_NOWAIT);
> if (IS_ERR(ea_inode))
> goto next_entry;
> ext4_xattr_inode_set_class(ea_inode);
> --
> 2.47.3
>
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR