Re: [PATCH v8 00/17] gfs2: Fix mmap + page fault deadlocks
From: Theodore Ts'o
Date: Tue Oct 26 2021 - 01:13:52 EST
On Mon, Oct 25, 2021 at 08:24:26PM +0200, Andreas Gruenbacher wrote:
> > For generic_perform_write() Dave Hansen attempted to move the fault-in
> > after the uaccess in commit 998ef75ddb57 ("fs: do not prefault
> > sys_write() user buffer pages"). This was reverted as it was exposing an
> > ext4 bug. I don't [know] whether it was fixed but re-applying Dave's commit
> > avoids the performance drop.
>
> Interesting. The revert of commit 998ef75ddb57 is in commit
> 00a3d660cbac. Maybe Dave and Ted can tell us more about what went
> wrong in ext4 and whether it's still an issue.
The context for the revert can be found here[1].

[1] https://lore.kernel.org/lkml/20151005152236.GA8140@xxxxxxxxx/

And "what went wrong in ext4" was fixed here[2].

[2] https://lore.kernel.org/lkml/20151005152236.GA8140@xxxxxxxxx/

which landed upstream as commit b90197b65518 ("ext4: use private
version of page_zero_new_buffers() for data=journal mode").
So it looks like the original issue which triggered the revert in 2015
should now be addressed, and we can easily check that by running
xfstests generic/208 on ext4 in data=journal mode.
There also seems to be a related discussion about whether we should
unrevert 998ef75ddb57 here[3]. Hmm, one message in that thread says:
"Side note: search for "iov_iter_fault_in_writeable()" on lkml
for a gfs2 patch-series that is buggy, exactly because it does *not*
use the atomic user space accesses, and just tries to do the fault-in
to hide the real bug." I assume that's related to the discussion on
this thread?
[3] https://lore.kernel.org/all/3221175.1624375240@xxxxxxxxxxxxxxxxxxxxxx/T/#u
- Ted