[PATCH] ocfs2: fix possible deadlock between unlink and dio_end_io_write

From: Joseph Qi

Date: Thu Mar 05 2026 - 22:22:52 EST


ocfs2_unlink takes the orphan dir inode_lock first and then ip_alloc_sem,
while ocfs2_dio_end_io_write acquires these locks in the reverse order.
This creates an ABBA lock ordering violation between lock classes
ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE] and
ocfs2_file_ip_alloc_sem_key.

Lock Chain #0 (orphan dir inode_lock -> ip_alloc_sem):
ocfs2_unlink
  ocfs2_prepare_orphan_dir
    ocfs2_lookup_lock_orphan_dir
      inode_lock(orphan_dir_inode) <- Lock A
    __ocfs2_prepare_orphan_dir
      ocfs2_prepare_dir_for_insert
        ocfs2_extend_dir
          ocfs2_expand_inline_dir
            down_write(&oi->ip_alloc_sem) <- Lock B

Lock Chain #1 (ip_alloc_sem -> orphan dir inode_lock):
ocfs2_dio_end_io_write
  down_write(&oi->ip_alloc_sem) <- Lock B
  ocfs2_del_inode_from_orphan()
    inode_lock(orphan_dir_inode) <- Lock A

Deadlock Scenario:
  CPU0 (unlink)                     CPU1 (dio_end_io_write)
  ------                            ------
  inode_lock(orphan_dir_inode)
                                    down_write(ip_alloc_sem)
  down_write(ip_alloc_sem)
                                    inode_lock(orphan_dir_inode)

ip_alloc_sem protects allocation changes, which are unrelated to the
operations in ocfs2_del_inode_from_orphan. So move
ocfs2_del_inode_from_orphan out of ip_alloc_sem to fix the deadlock.
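
The ordering argument above can be checked with a small userspace model.
This is only a sketch, not kernel code: the enum values, struct
order_graph and all model_* helpers are hypothetical names made up for
illustration, and it assumes ocfs2_del_inode_from_orphan drops the orphan
dir inode_lock before it returns. It records an edge each time one lock
class is taken while another is held, then looks for a pair of opposite
edges.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the two kernel lock classes. */
enum lock_class {
	ORPHAN_DIR_INODE_LOCK,	/* Lock A in the chains above */
	IP_ALLOC_SEM,		/* Lock B in the chains above */
	NR_LOCK_CLASSES
};

struct order_graph {
	/* edge[x][y] is true if y was ever acquired while x was held */
	bool edge[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
	bool held[NR_LOCK_CLASSES];
};

static void graph_init(struct order_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record an acquisition: add an edge from every held class to 'c'. */
static void model_lock(struct order_graph *g, enum lock_class c)
{
	for (int h = 0; h < NR_LOCK_CLASSES; h++)
		if (g->held[h])
			g->edge[h][c] = true;
	g->held[c] = true;
}

static void model_unlock(struct order_graph *g, enum lock_class c)
{
	assert(g->held[c]);	/* unlocking a lock that is not held is a model bug */
	g->held[c] = false;
}

/* An ABBA inversion is a pair of opposite edges between two classes. */
static bool has_abba(const struct order_graph *g)
{
	for (int a = 0; a < NR_LOCK_CLASSES; a++)
		for (int b = a + 1; b < NR_LOCK_CLASSES; b++)
			if (g->edge[a][b] && g->edge[b][a])
				return true;
	return false;
}

/* Lock Chain #0: unlink takes A, then B nested inside it. */
static void model_unlink(struct order_graph *g)
{
	model_lock(g, ORPHAN_DIR_INODE_LOCK);
	model_lock(g, IP_ALLOC_SEM);
	model_unlock(g, IP_ALLOC_SEM);
	model_unlock(g, ORPHAN_DIR_INODE_LOCK);
}

/* Lock Chain #1 before the patch: A is taken while B is held. */
static void model_dio_end_io_write_old(struct order_graph *g)
{
	model_lock(g, IP_ALLOC_SEM);
	model_lock(g, ORPHAN_DIR_INODE_LOCK);	/* orphan delete under B */
	model_unlock(g, ORPHAN_DIR_INODE_LOCK);
	model_unlock(g, IP_ALLOC_SEM);
}

/* After the patch: the orphan delete finishes before B is taken. */
static void model_dio_end_io_write_new(struct order_graph *g)
{
	model_lock(g, ORPHAN_DIR_INODE_LOCK);	/* orphan delete, no B held */
	model_unlock(g, ORPHAN_DIR_INODE_LOCK);
	model_lock(g, IP_ALLOC_SEM);
	model_unlock(g, IP_ALLOC_SEM);
}

/* True if the unlink path plus the chosen dio path form an inversion. */
static bool paths_conflict(bool patched)
{
	struct order_graph g;

	graph_init(&g);
	model_unlink(&g);
	if (patched)
		model_dio_end_io_write_new(&g);
	else
		model_dio_end_io_write_old(&g);
	return has_abba(&g);
}
```

In this model paths_conflict(false) reports the inversion and
paths_conflict(true) does not, since the patched path never holds both
classes at once. The model ignores every other lock held on these paths,
so it illustrates the ordering argument only and is no substitute for
lockdep.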

Reported-by: syzbot+67b90111784a3eac8c04@xxxxxxxxxxxxxxxxxxxxxxxxx
Closes: https://syzkaller.appspot.com/bug?extid=67b90111784a3eac8c04
Fixes: a86a72a4a4e0 ("ocfs2: take ip_alloc_sem in ocfs2_dio_get_block & ocfs2_dio_end_io_write")
Signed-off-by: Joseph Qi <joseph.qi@xxxxxxxxxxxxxxxxx>
---
fs/ocfs2/aops.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 17ba79f443ee..09146b43d1f0 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2294,8 +2294,6 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
 		goto out;
 	}

-	down_write(&oi->ip_alloc_sem);
-
 	/* Delete orphan before acquire i_rwsem. */
 	if (dwc->dw_orphaned) {
 		BUG_ON(dwc->dw_writer_pid != task_pid_nr(current));
@@ -2308,6 +2306,7 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
 			mlog_errno(ret);
 	}

+	down_write(&oi->ip_alloc_sem);
 	di = (struct ocfs2_dinode *)di_bh->b_data;

 	ocfs2_init_dinode_extent_tree(&et, INODE_CACHE(inode), di_bh);
--
2.39.3