Re: [PATCH] xfs: using mutex instead of semaphore for xfs_buf_lock()

From: Dave Chinner
Date: Wed Jan 15 2025 - 15:54:37 EST


On Wed, Jan 15, 2025 at 08:05:21PM +0800, Jinliang Zheng wrote:
> On Wed, 15 Jan 2025 11:28:54 +1100, Dave Chinner wrote:
> > On Fri, Dec 20, 2024 at 01:16:29AM +0800, Jinliang Zheng wrote:
> > > xfs_buf uses a semaphore for mutual exclusion, and its count value
> > > is initialized to 1, which is equivalent to a mutex.
> > >
> > > However, mutex->owner can provide more information when analyzing
> > > vmcore, making it easier for us to identify which task currently
> > > holds the lock.
> >
> > However, the buffer lock also protects the buffer state and contents
> > whilst IO is being performed and it *is not owned by any task*.
> >
> > A single lock cycle for a buffer can pass through multiple tasks
> > before being unlocked in a different task to that which locked it:
> >
> > p0                        <intr>                  <kworker>
> > xfs_buf_lock()
> > ...
> > <submitted for async io>
> > <wait for IO completion>
> >                           .....
> >                           <io completion>
> >                           queued to workqueue
> >                                                   .....
> >                                                   perform IO completion
> >                                                   xfs_buf_unlock()
> >
> >
> > IOWs, the buffer lock here prevents any other task from accessing
> > and modifying the contents/state of the buffer until the IO in
> > flight is completed. i.e. the buffer contents are guaranteed to be
> > stable during write IO, and unreadable when uninitialised during
> > read IO....
>
> Yes.
>
> >
> > i.e. the locking model used by xfs_buf objects is incompatible with
> > the single-owner-task critical section model implemented by
> > mutexes...
> >
>
> Yes, from a model perspective.
>
> This patch is proposed for two reasons:
> 1. The maximum count of the xfs_buf->b_sema is 1, which means that only one
> kernel code path can hold it at the same time. From this perspective,
> changing it to mutex will not have any functional impact.
> 2. When troubleshooting hung tasks in XFS, it is sometimes necessary to
> locate who has acquired the lock. Although, as you said, xfs_buf->b_sema
> will flow to other kernel code paths after being down(), it is also helpful
> to know which kernel code path locked it first.
>
> Haha, that's just my thought. If you think there is really no need to know who
> called the down() on xfs_buf->b_sema, please just ignore this patch.

We are rejecting the patch because it's fundamentally broken, not
because we don't want debugging visibility.

If you want to track what task locked a semaphore, then that should
be added to the semaphore implementation. Changing the XFS locking
implementation is not the solution to the problem you are trying to
solve.
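
To illustrate the idea, here is a minimal userspace sketch, assuming
POSIX semaphores and pthreads rather than the kernel semaphore API. It
is not kernel code and not a proposed patch: struct tracked_sem, the
tracked_sem_*() helpers, completion_worker() and run_handoff_demo() are
all hypothetical names. The wrapper records which thread took the
semaphore purely as debug information, while still allowing a different
thread to release it, which is the handoff pattern the buffer lock
relies on.

```c
/* Userspace analogue only: POSIX semaphores + pthreads, not kernel code. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>

/* Hypothetical wrapper: a binary semaphore that remembers who locked it. */
struct tracked_sem {
	sem_t		sem;
	pthread_t	locker;	/* debug info only, valid while locked */
	bool		locked;
};

static int tracked_sem_init(struct tracked_sem *ts)
{
	ts->locked = false;
	return sem_init(&ts->sem, 0, 1);	/* count 1: one holder at a time */
}

static void tracked_sem_lock(struct tracked_sem *ts)
{
	/* Retry if the wait was interrupted by a signal. */
	while (sem_wait(&ts->sem) == -1 && errno == EINTR)
		;
	ts->locker = pthread_self();	/* record the locker, not an "owner" */
	ts->locked = true;
}

/* May be called from any thread, not only the one that locked. */
static void tracked_sem_unlock(struct tracked_sem *ts)
{
	assert(ts->locked);
	ts->locked = false;
	sem_post(&ts->sem);
}

/* Stands in for the IO completion worker: releases a lock it never took. */
static void *completion_worker(void *arg)
{
	tracked_sem_unlock(arg);
	return NULL;
}

/* Returns 0 if the lock taken here was released by another thread. */
int run_handoff_demo(void)
{
	struct tracked_sem ts;
	pthread_t worker;

	if (tracked_sem_init(&ts) != 0)
		return -1;

	tracked_sem_lock(&ts);
	if (!pthread_equal(ts.locker, pthread_self()))
		return -1;

	if (pthread_create(&worker, NULL, completion_worker, &ts) != 0)
		return -1;
	pthread_join(worker, NULL);

	/* The worker released it, so a non-blocking acquire must succeed. */
	if (sem_trywait(&ts.sem) != 0)
		return -1;
	sem_post(&ts.sem);
	sem_destroy(&ts.sem);
	return 0;
}
```

The recorded locker is only a hint about where the lock cycle started.
It is never checked on unlock, so the cross-task release stays legal,
which an owner-checking mutex would not allow.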

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx