Re: [RFC] Badness in __mutex_unlock_slowpath with XFS stress tests

From: Nathan Scott
Date: Mon Jul 10 2006 - 20:17:19 EST


On Mon, Jul 10, 2006 at 12:46:23PM -0400, Stephane Doyon wrote:
> Hi,

Hi Stephane,

> A few months back, a fix was introduced for a mutex double unlock warning
> related to direct I/O in XFS. I believe that fix has a lock ordering
> problem that can cause a deadlock.
>
> The problem was that __blockdev_direct_IO() would unlock the i_mutex in
> the READ and DIO_OWN_LOCKING case, and the mutex would be unlocked again
> in xfs_read() soon after returning from __generic_file_aio_read(). Because
> there are lots of execution paths down from __generic_file_aio_read() that
> do not consistently release the i_mutex, the safest fix was deemed to be
> to reacquire the i_mutex before returning from __blockdev_direct_IO(). At
> this point however, the reader is holding an xfs ilock, and AFAICT the
> i_mutex is usually taken BEFORE the xfs ilock.

That is correct, yes. Hmm. Subtle. Painful. Thanks for the detailed
report and your analysis.

> We are seeing a deadlock between a process writing and another process
> reading the same file, when the reader is using direct I/O. (The writer

Is that a direct writer or a buffered writer?

> must apparently be growing the file, using either direct or buffered
> I/O.) Something like this, on XFS:
> (dd if=/dev/zero of=f bs=128K count=5000 & ) ; sleep 1 ; dd of=/dev/null
> if=f iflag=direct bs=128K count=5000
>
> Seen on kernels 2.6.16 and 2.6.17.
>
> The deadlock scenario appears to be this:
> -The reader goes into xfs_read(), takes the i_mutex and then an xfs_ilock
> XFS_IOLOCK_SHARED, then calls down to __generic_file_aio_read() which
> eventually goes down to __blockdev_direct_IO(). In there it drops the
> i_mutex.
> -The writer goes into xfs_write() and obtains the i_mutex. It then tries
> to get an xfs_ilock XFS_ILOCK_EXCL and must wait on it since it's held by
> the reader.
> -The reader, still in __blockdev_direct_IO(), executes directio_worker()
> and then tries to reacquire the i_mutex, and must wait on it because the
> writer has it.
>
> And so we have a deadlock.

*nod*. This will require some thought, I'm not sure I like the sound of
your suggested workaround there a whole lot, unfortunately, but maybe it
is all we can do at this stage. Let me ponder further and get back to you
(but if you want to prototype your fix further, that'd be most welcome, of
course).

cheers.

--
Nathan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/