Re: [PATCH v3 1/2] ext4: Pass in DIO_SKIP_DIO_COUNT flag if inode_dio_begin() called

From: Waiman Long
Date: Wed Apr 20 2016 - 11:59:53 EST

On 04/19/2016 07:01 PM, Dave Chinner wrote:
On Mon, Apr 18, 2016 at 03:46:46PM -0400, Waiman Long wrote:
On 04/15/2016 06:19 PM, Dave Chinner wrote:
On Fri, Apr 15, 2016 at 01:17:41PM -0400, Waiman Long wrote:
On 04/15/2016 04:17 AM, Dave Chinner wrote:
On Thu, Apr 14, 2016 at 12:21:13PM -0400, Waiman Long wrote:
What the patch does is to eliminate the innermost
inode_dio_begin/end pair.
Yes, and with that change inode_dio_wait() no longer waits for
AIO+DIO writes on ext4, hence breaking truncate IO barrier
requirements of inode_dio_wait().
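
For readers following the thread, the barrier problem described above can be sketched with a toy user-space model. This is only an illustration, assuming the behaviour stated in this exchange: the caller takes an outer begin/end pair around submission, and the generic direct I/O path takes an inner pair whose end is deferred to I/O completion. All `toy_*` names are hypothetical and are not kernel functions.

```c
#include <stdbool.h>

/* Toy stand-in for the inode's count of in-flight direct I/Os. */
struct toy_inode {
    int dio_count;
};

/* Hypothetical analogues of inode_dio_begin() / inode_dio_end(). */
static void toy_dio_begin(struct toy_inode *inode) { inode->dio_count++; }
static void toy_dio_end(struct toy_inode *inode)   { inode->dio_count--; }

/*
 * Models what a waiter such as inode_dio_wait() would observe:
 * true means the count is zero, so truncate would proceed without waiting.
 */
static bool toy_dio_wait_would_return(const struct toy_inode *inode)
{
    return inode->dio_count == 0;
}

/*
 * Models submission of an asynchronous direct write.
 * skip_inner_count stands for passing DIO_SKIP_DIO_COUNT (assumed semantics:
 * the generic path does not touch the counter). The function returns while
 * the I/O is still in flight.
 */
static void toy_submit_async_write(struct toy_inode *inode, bool skip_inner_count)
{
    toy_dio_begin(inode);          /* outer pair, taken by the filesystem */
    if (!skip_inner_count)
        toy_dio_begin(inode);      /* inner pair, taken by the generic path */
    /* ... I/O is queued here; completion happens later ... */
    toy_dio_end(inode);            /* outer pair dropped when submission returns */
}

/* Models I/O completion, where the deferred inner end would run. */
static void toy_complete_async_write(struct toy_inode *inode, bool skip_inner_count)
{
    if (!skip_inner_count)
        toy_dio_end(inode);
}
```

In this model, submitting with `skip_inner_count` set leaves the count at zero while the write is still outstanding, so the waiter would return early. With the inner pair kept, the count stays nonzero until `toy_complete_async_write()` runs.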


You are right, and thanks for pointing this out to me. I think I focused too
much on the dax_do_io() internals and didn't realize that inode_dio_end() can
be deferred in __blockdev_direct_IO(). I will update my patch to eliminate
the extra inode_dio_begin/end pair only for dax_do_io().
Even there, there is the risk that a future change will break ext4.
The ext4 code needs fixing first, then you can look at skipping the
DIO based counting everywhere.

i.e. fix the root cause of the problem, don't hack around it or
throw band-aids over it.
I agree that the ext4 code needs fixing w.r.t. the problem that you
found. That will take more time and testing. In the meantime, I
think it is OK to pick the low-hanging fruit that is handled by my
IOWs, you're saying that you won't fix the problem, because all you
care about is scalability results. This is how we end up with code
that breaks randomly in future because if it doesn't get fixed now,
nobody will fix the underlying problem. So, fix it now, fix it
properly and you still get your scalability improvement without
leaving a landmine that will explode on someone else in future.

Fix it now, fix it properly.

I am not saying that I will not fix it. I am just saying that I need more time to fully understand what code changes need to be made. I am not that well versed in filesystem internals, though it will be a good learning experience for me.