[RFC PATCH 0/3] hugetlbfs: address fault time regression

From: Mike Kravetz
Date: Mon Jul 06 2020 - 16:27:41 EST


Commits c0d0381ade79 and 87bf91d39bb5 changed the way hugetlb locking
was performed to address BUGs. One specific change was to always take
the i_mmap_rwsem in read mode during fault processing. One result of
this change was a 33% regression for anon non-shared page faults [1].

Technically, i_mmap_rwsem only needs to be taken during page faults
if the pmd can potentially be shared. pmd sharing is not possible for
anon non-shared mappings (as in the reported regression), therefore the
code can be modified to not acquire the semaphore in this case.
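As a rough illustration of the idea above (not the actual patch), the
decision can be modeled in user space as follows. The struct, the
may_share flag and the helper names are hypothetical stand-ins, and the
"semaphore" is just a reader count in a single-threaded model:

```c
#include <stdbool.h>

/* Hypothetical user-space model; these are not kernel structures. */
struct mapping_model {
	int readers;	/* models read holders of i_mmap_rwsem */
	bool may_share;	/* models "pmd sharing is possible" */
};

static void mapping_init(struct mapping_model *m, bool may_share)
{
	m->readers = 0;
	m->may_share = may_share;
}

/*
 * Start of fault processing: take the modeled semaphore in read mode
 * only when pmd sharing is possible. Returns true if it was taken so
 * that the caller knows whether to release it.
 */
static bool fault_begin(struct mapping_model *m)
{
	if (!m->may_share)
		return false;	/* anon non-shared: skip the semaphore */
	m->readers++;
	return true;
}

/* End of fault processing: release only what fault_begin() took. */
static void fault_end(struct mapping_model *m, bool locked)
{
	if (locked)
		m->readers--;
}
```

The point of returning the "locked" state is that the unlock side must
make the same decision as the lock side, which is part of why the
location of this decision matters.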

Unfortunately, commit 87bf91d39bb5 depends on i_mmap_rwsem always being
taken in the fault path to prevent fault/truncation races. Once the
semaphore is only taken conditionally, that approach is no longer
appropriate. Rather, the code now detects such races and backs out the
operations.
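A minimal user-space sketch of the "detect and back out" pattern, under
the assumption that a race is detected by rechecking the faulting index
against the file size after the page has been tentatively set up. All
names here are hypothetical, and the concurrent truncation is simulated
with a callback rather than a real second thread:

```c
#include <stddef.h>

enum fault_result { FAULT_OK, FAULT_SIGBUS };

/* Hypothetical model of a hugetlbfs file; not a kernel structure. */
struct file_model {
	size_t size_pages;	/* models i_size in huge page units */
	size_t pages_in_use;	/* models pages charged to the file */
};

/* Callback used to simulate a truncation racing with the fault. */
typedef void (*race_hook)(struct file_model *f);

static void truncate_to_zero(struct file_model *f)
{
	f->size_pages = 0;
}

static enum fault_result fault_in_page(struct file_model *f, size_t idx,
				       race_hook hook)
{
	if (idx >= f->size_pages)
		return FAULT_SIGBUS;	/* beyond end of file */

	f->pages_in_use++;		/* tentative allocation */

	if (hook)
		hook(f);		/* window where truncation may run */

	/* Recheck the size: if truncation won the race, back out. */
	if (idx >= f->size_pages) {
		f->pages_in_use--;	/* undo the tentative allocation */
		return FAULT_SIGBUS;
	}
	return FAULT_OK;
}
```

In this model a fault that loses the race leaves pages_in_use exactly
as it found it, which is the property the back out path has to provide.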

This code "works" in that it only takes i_mmap_rwsem when necessary and
addresses the original BUGs. However, I am sending as an RFC because:
- I am unsure if the added complexity is worth the performance benefit.
- There needs to be a better way/location to make a decision about taking
  the semaphore. See FIXME's in the code.

Comments and suggestions would be appreciated.

[1] https://lore.kernel.org/lkml/20200622005551.GK5535@shao2-debian

Mike Kravetz (3):
  Revert: "hugetlbfs: Use i_mmap_rwsem to address page fault/truncate
    race"
  hugetlbfs: Only take i_mmap_rwsem when sharing is possible
  hugetlbfs: handle page fault/truncate races

 fs/hugetlbfs/inode.c |  69 +++++++++-----------
 mm/hugetlb.c         | 150 ++++++++++++++++++++++++++++++-------------
 2 files changed, 137 insertions(+), 82 deletions(-)

--
2.25.4