Re: [PATCH v5 0/2] ext4: Improve parallel I/O performance on NVDIMM

From: Waiman Long
Date: Mon May 02 2016 - 13:45:32 EST


On 05/01/2016 01:28 PM, Christoph Hellwig wrote:
> On Fri, Apr 29, 2016 at 12:38:20PM -0400, Waiman Long wrote:
>> From my testing, it looks like overwrites to the same file in an ext4
>> filesystem on DAX can run in parallel even if their ranges overlap.
>> This is mainly because the code drops the i_mutex before the write.
>> That means the overlapped blocks can end up with garbage. I think this
>> is a problem, but I am not enough of an ext4 expert to say for sure. I
>> would like to know your thoughts on that.
> That's another issue with dax I/O pretending to be direct I/O. Because
> it isn't, we'll need to synchronize it like buffered I/O and not like
> direct I/O in all file systems.

From what I saw in the code, I think filemap_write_and_wait_range()
should have prevented concurrent overwrites from stepping on each
other for non-DAX I/O. However, it is essentially a no-op for DAX
I/O, so that protection is gone.
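
For anyone who wants to look for this from user space, here is a minimal
sketch of a checker. It is not the test I ran. The names REGION_LEN,
ranges_overlap(), region_is_torn() and count_torn_rounds() are made up
for illustration, and the file path is whatever the caller passes in. Two
child processes overwrite the same range with different fill bytes, and
the parent then checks whether the range holds a mix of both.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical size of the range that both writers overwrite. */
#define REGION_LEN (1u << 20)

/* Return 1 if [off1, off1+len1) and [off2, off2+len2) share a byte. */
int ranges_overlap(off_t off1, size_t len1, off_t off2, size_t len2)
{
    return off1 < off2 + (off_t)len2 && off2 < off1 + (off_t)len1;
}

/* Return 1 if buf holds more than one byte value, i.e. the two
 * writers' data got interleaved; 0 if it is uniform. */
int region_is_torn(const char *buf, size_t len)
{
    assert(len > 0);
    for (size_t i = 1; i < len; i++)
        if (buf[i] != buf[0])
            return 1;
    return 0;
}

/* Child: overwrite the region at offset 0 with 'fill', then exit. */
static void overwrite_region(const char *path, char fill)
{
    char *buf = malloc(REGION_LEN);
    int fd = open(path, O_WRONLY);

    if (!buf || fd < 0)
        _exit(1);
    memset(buf, fill, REGION_LEN);
    _exit(pwrite(fd, buf, REGION_LEN, 0) == (ssize_t)REGION_LEN ? 0 : 1);
}

/* Race two overwriters 'rounds' times on 'path'. Returns the number of
 * rounds that left mixed data in the region, or -1 on error. */
int count_torn_rounds(const char *path, int rounds)
{
    char *buf = calloc(1, REGION_LEN);
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    int torn = 0;

    /* Allocate the blocks up front so later writes are pure overwrites. */
    if (!buf || fd < 0 ||
        pwrite(fd, buf, REGION_LEN, 0) != (ssize_t)REGION_LEN)
        torn = -1;

    for (int r = 0; torn >= 0 && r < rounds; r++) {
        int sa = 0, sb = 0;
        pid_t a, b;

        a = fork();
        if (a == 0)
            overwrite_region(path, 'A');
        b = fork();
        if (b == 0)
            overwrite_region(path, 'B');

        /* Reap whichever children were started before judging the round. */
        if (a > 0)
            waitpid(a, &sa, 0);
        if (b > 0)
            waitpid(b, &sb, 0);

        if (a < 0 || b < 0 || sa != 0 || sb != 0 ||
            pread(fd, buf, REGION_LEN, 0) != (ssize_t)REGION_LEN)
            torn = -1;
        else
            torn += region_is_torn(buf, REGION_LEN);
    }

    if (fd >= 0)
        close(fd);
    free(buf);
    return torn;
}
```

Calling count_torn_rounds() on a file in an ext4 filesystem mounted with
-o dax and getting a non-zero result would show overlapping overwrites
being interleaved. A zero result proves nothing, since the race window
may simply not have been hit.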

I am planning to send out a patch to disable mutex dropping for DAX
overwrites. There is still an issue on the read side. If the journal is
disabled and the dioread_nolock mount option is used, reads will be done
without locking. Again, the filemap_write_and_wait_range() check on
the read side will not protect against a concurrent write.

Cheers,
Longman