Re: [PATCH] ocfs2: Fix circular locking dependency in ocfs2_del_inode_from_orphan()
From: Heming Zhao
Date: Tue Feb 24 2026 - 03:57:16 EST
On Mon, Feb 23, 2026 at 10:35:40AM +0530, Prithvi wrote:
> On Mon, Feb 16, 2026 at 09:47:08AM +0530, Prithvi wrote:
> > On Thu, Jan 15, 2026 at 04:47:56AM +0000, Matthew Wilcox wrote:
> > > On Thu, Jan 15, 2026 at 08:48:09AM +0530, Prithvi wrote:
> > > > On Fri, Jan 09, 2026 at 12:07:00AM +0530, Prithvi wrote:
> > > > > IIUC that would be the case when ocfs2_dio_end_io_write() operates on normal
> > > > > files. However, according to the report of the bug, ocfs2_dio_end_io_write()
> > > > > is holding &ocfs2_quota_ip_alloc_sem_key so I think due to random fuzzing by
> > > > > syzkaller, the function is curently operating on a quota file.
> > >
> > > So the right fix is probably to disallow DIO to a quota file, wouldn't
> > > you say?
> >
> > Hello all,
> >
> > Firstly I apologize for the delay...I was trying to understand the bug in
> > detail and few strange things I noticed. I have now confirmed the bug is
> > due to uninitialized locks during slab reuse. Here are the deatils:
> >
> > According to earlier observations, we knew that a file with a quota lock
> > was trying to enter DIO path which is not permitted. I observed that
> > that in spite of adding checks like IS_NOQUOTA(inode) since I observed:
> >
> > else if (fe->i_flags & cpu_to_le32(OCFS2_QUOTA_FL)) {
> > inode->i_flags |= S_NOQUOTA;
> > }
> >
> > in ocfs2_populate_inode(). However with this change also the issue didn't
> > seem to get resolved.
> >
> > I found that it was some inode numbered 16979 which was entering DIO.
> > However, upon investigation I noticed that inode 16979 passed through
> > ocfs2_populate_inode() only when create_ino = 1. Upon adding a few printk
> > statements in __ocfs2_mknod_locked() I found that the inode 16979 strangely
> > held ocfs2_quota_ip_allc_sem lock at the mknod stage. I suspected it to be
> > a lock cleanup issue in the slab object when inodes are returned to the slab.
> >
> > I found that in one mount session, an inode was mounted as a quota file,
> > with a very small number (mostly <50) then later on in another mount session
> > that same slab object is reused for the inode with number 16979 which
> > immediately causes the lockdep warning to occur since it still holds the
> > quota_ip_alloc_sem lock.
> >
> > IIUC, I did not find any way the semaphore is cleaned when an inode is
> > evicted. Hence I think to add init_rwsem() to the ocfs2_clear_inode() to
> > stop the warning, since now that inode held ip_alloc_sem lock only, thus
> > no quota lock acquired during DIO and the issue never occurs, even if same
> > slab object is used which retains the lock identity. In addition, a check
> > can be added to block DIO on quota files.
> >
> > Thus, I think the following change should work:
> >
> > diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> > index 76c86f1c2b1c..372c198afa25 100644
> > --- a/fs/ocfs2/aops.c
> > +++ b/fs/ocfs2/aops.c
> > @@ -2420,7 +2420,13 @@ static ssize_t ocfs2_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
> > struct inode *inode = file->f_mapping->host;
> > struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
> > get_block_t *get_block;
> > -
> > +
> > + if(IS_NOQUOTA(inode)) {
> > + mlog(ML_ERROR, "Direct IO is not allowed for Quota inode %lu\n",
> > + inode->i_ino);
> > + return -ENOTSUP;
> > + }
> > +
> > /*
> > * Fallback to buffered I/O if we see an inode without
> > * extents.
> > diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
> > index b5fcc2725a29..3e36da2ca5d0 100644
> > --- a/fs/ocfs2/inode.c
> > +++ b/fs/ocfs2/inode.c
> > @@ -1268,7 +1268,8 @@ static void ocfs2_clear_inode(struct inode *inode)
> > "Clear inode of %llu, alloc_sem is locked\n",
> > (unsigned long long)oi->ip_blkno);
> > up_write(&oi->ip_alloc_sem);
> > -
> > + init_rwsem(&oi->ip_alloc_sem);
> > +
> > mlog_bug_on_msg(oi->ip_open_count,
> > "Clear inode of %llu has open count %d\n",
> > (unsigned long long)oi->ip_blkno, oi->ip_open_count);
> >
> > What do you think?
> >
> > Thank you and best regards,
> > Prithvi
>
> Hello everyone,
>
> Just a gentle ping on this thread...I wanted to seek feedback regarding the
> analysis of uninitialized locks during slab reuse. If the proposed fix is
> correct, I can go ahead and send a v2 patch for the same.
>
> In addition, I also realised using -EPERM in the if condition in
> ocfs2_direct_IO() might be more relevant so I tested a patch with that change
> and it doesn't trigger any issue with the testing with reproducer program as
> tested here :
> https://lore.kernel.org/all/66fdfef3.050a0220.9ec68.0031.GAE@xxxxxxxxxx/T/#m00f39888c9cfdb78b2c5eff8d1e2cba82e4fd9cc
>
> Thanks,
> Prithvi
Hi Prithvi,
This patch disables DIO on quota files, which will lead to a performance
regression. Other filesystems (such as ext4 and gfs2) do not forbid DIO on quota
files. Therefore, I NAK this patch.
I have sent an alternative patch[1] to address this issue. Please take a look.
[1]:
https://lore.kernel.org/ocfs2-devel/20260224084909.28361-1-heming.zhao@xxxxxxxx/T/#u
Thanks,
Heming