Re: [PATCH mm-unstable v1] mm/hugetlb_vmemmap: fix memory loads ordering

From: David Hildenbrand
Date: Fri Jan 10 2025 - 14:17:18 EST


On 10.01.25 20:04, David Hildenbrand wrote:
On 07.01.25 18:02, David Hildenbrand wrote:
On 07.01.25 17:35, Matthew Wilcox wrote:
On Tue, Jan 07, 2025 at 09:49:18AM +0100, David Hildenbrand wrote:
+++ b/include/linux/page-flags.h
@@ -212,7 +212,7 @@ static __always_inline const struct page *page_fixed_fake_head(const struct page
* cold cacheline in some cases.
*/
if (IS_ALIGNED((unsigned long)page, PAGE_SIZE) &&
- test_bit(PG_head, &page->flags)) {
+ test_bit_acquire(PG_head, &page->flags)) {

This change will affect all page_fixed_fake_head() users, like ordinary
PageTail even on !hugetlb.
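
To make the ordering concern concrete, here is a minimal userspace sketch using C11 atomics. It assumes the fake-head lookup reads page[1].compound_head after testing PG_head, as the hunk above suggests. All model_* names are hypothetical stand-ins, not kernel API, and the PAGE_SIZE alignment check is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PG_HEAD 0x1UL /* hypothetical flag bit, not the kernel's value */

struct model_page {
	_Atomic unsigned long flags;
	uintptr_t compound_head; /* bit 0 set means "points at the real head" */
};

/* Acquire load: later loads cannot be reordered before this flag test. */
static bool model_test_bit_acquire(struct model_page *page, unsigned long mask)
{
	return atomic_load_explicit(&page->flags, memory_order_acquire) & mask;
}

/* Model of the fake-head lookup: flag test first, then the dependent read. */
static struct model_page *model_fixed_fake_head(struct model_page *page)
{
	if (model_test_bit_acquire(page, MODEL_PG_HEAD)) {
		uintptr_t head = page[1].compound_head;

		if (head & 1)
			return (struct model_page *)(head - 1);
	}
	return page;
}
```

Because the acquire load sits on the path of every caller, it is paid even when PG_head is clear, which is the cost being pointed out for ordinary PageTail() users.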

I've been looking at the callers of PageTail() because it's going to
be a bit of a weird thing to be checking in the separate-page-and-folio
world. Obviously we can implement it, but there's a bit of a "But why
would you want to ask that question" question.

Most current occurrences of PageTail() are in assertions of one form or
another. Fair enough, not performance critical.

make_device_exclusive_range() is a little weird; looks like it's trying
to make sure that each folio is only made exclusive once, and ignore any
partial folios which overlap the start of the area.

I could have sworn we only support small folios here, but looks like
we do support large folios.

IIUC, there is no way to identify reliably "this folio is device exclusive",
the only hint is "no mappings". The following might do:

diff --git a/mm/rmap.c b/mm/rmap.c
index c6c4d4ea29a7e..1424d0a351a86 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2543,7 +2543,13 @@ int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
for (i = 0; i < npages; i++, start += PAGE_SIZE) {
struct folio *folio = page_folio(pages[i]);
- if (PageTail(pages[i]) || !folio_trylock(folio)) {
+
+ /*
+ * If there are no mappings, either the folio is actually
+ * unmapped or only device-exclusive swap entries point at
+ * this folio.
+ */
+ if (!folio_mapped(folio) || !folio_trylock(folio)) {
folio_put(folio);
pages[i] = NULL;
continue;

I stared at this longer, and I'm not sure that will work.

The PageTail() check is in place because we return with the folio locked on
success, so we won't trylock again on tail pages.
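
A minimal userspace model of that loop, to illustrate the point: one folio backs several page entries, the head entry takes the lock, and the tail entries are dropped without a second trylock. The model_* types and helpers are hypothetical simplifications, not the kernel structures, and the refcount decrement only stands in for folio_put().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_folio {
	bool locked;
	int refcount; /* one reference per looked-up page entry */
};

struct model_page {
	struct model_folio *folio;
	bool is_tail;
};

static bool model_folio_trylock(struct model_folio *folio)
{
	if (folio->locked)
		return false;
	folio->locked = true;
	return true;
}

/* Keeps entries whose folio we locked, clears the rest; returns the number kept. */
static size_t model_filter_pages(struct model_page **pages, size_t npages)
{
	size_t kept = 0;

	for (size_t i = 0; i < npages; i++) {
		struct model_folio *folio = pages[i]->folio;

		/* Tail entries are skipped before any second trylock is attempted. */
		if (pages[i]->is_tail || !model_folio_trylock(folio)) {
			assert(folio->refcount > 0);
			folio->refcount--; /* stands in for folio_put() */
			pages[i] = NULL;
			continue;
		}
		kept++;
	}
	return kept;
}
```

In this model a four-page folio ends up with exactly one surviving entry (the head), the folio locked once, and one reference left.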

But staring at page_make_device_exclusive_one(), I am not sure if it
does what we want in all cases ...

... and the hmm selftests just keep failing upstream as well?! huh. :)

I'll try spending some time on this to see if I can grasp what needs to
be done and how it could be handled ... better.


As expected ...

# echo never > /sys/kernel/mm/transparent_hugepage/enabled
# ./hmm-tests
...
# RUN hmm.hmm_device_private.exclusive ...
# OK hmm.hmm_device_private.exclusive
ok 21 hmm.hmm_device_private.exclusive
# RUN hmm.hmm_device_private.exclusive_mprotect ...
# OK hmm.hmm_device_private.exclusive_mprotect
ok 22 hmm.hmm_device_private.exclusive_mprotect
# RUN hmm.hmm_device_private.exclusive_cow ...
# OK hmm.hmm_device_private.exclusive_cow
ok 23 hmm.hmm_device_private.exclusive_cow
# RUN hmm.hmm_device_private.hmm_gup_test ...
# OK hmm.hmm_device_private.hmm_gup_test
...

# echo always > /sys/kernel/mm/transparent_hugepage/enabled
...
# RUN hmm.hmm_device_private.exclusive ...
# hmm-tests.c:1751:exclusive:Expected ret (-16) == 0 (0)
# exclusive: Test terminated by assertion
# FAIL hmm.hmm_device_private.exclusive
not ok 21 hmm.hmm_device_private.exclusive
# RUN hmm.hmm_device_private.exclusive_mprotect ...
# hmm-tests.c:1805:exclusive_mprotect:Expected ret (-16) == 0 (0)
# exclusive_mprotect: Test terminated by assertion
# FAIL hmm.hmm_device_private.exclusive_mprotect
not ok 22 hmm.hmm_device_private.exclusive_mprotect
# RUN hmm.hmm_device_private.exclusive_cow ...
# hmm-tests.c:1858:exclusive_cow:Expected ret (-16) == 0 (0)
# exclusive_cow: Test terminated by assertion
# FAIL hmm.hmm_device_private.exclusive_cow
not ok 23 hmm.hmm_device_private.exclusive_cow


So rejecting folio_test_large() would likely achieve the same thing right now.
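
Roughly, such a check could take the following shape. This is a userspace sketch only: the sketch_* names are hypothetical, and "large" is modeled as order > 0 rather than via the real page flags.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_folio {
	unsigned int order; /* 0 for a small folio (modeling assumption) */
	bool locked;
	int refcount;
};

static bool sketch_folio_test_large(const struct sketch_folio *folio)
{
	return folio->order > 0;
}

static bool sketch_folio_trylock(struct sketch_folio *folio)
{
	if (folio->locked)
		return false;
	folio->locked = true;
	return true;
}

/*
 * Returns true if the folio was claimed (locked). Large folios are
 * rejected up front, dropping the reference without touching the lock.
 */
static bool sketch_try_claim(struct sketch_folio *folio)
{
	if (sketch_folio_test_large(folio) || !sketch_folio_trylock(folio)) {
		assert(folio->refcount > 0);
		folio->refcount--; /* stands in for folio_put() */
		return false;
	}
	return true;
}
```

With this shape, a small folio is locked and kept, while a large folio is released unlocked.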

--
Cheers,

David / dhildenb