Re: [PATCHv5 5/7] mm: make compound_head() robust

From: Vlastimil Babka
Date: Mon Sep 14 2015 - 09:31:16 EST


On 09/11/2015 03:35 PM, Kirill A. Shutemov wrote:
index 097c7a4bfbd9..330377f83ac7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1686,8 +1686,7 @@ static void __split_huge_page_refcount(struct page *page,
(1L << PG_unevictable)));
page_tail->flags |= (1L << PG_dirty);

- /* clear PageTail before overwriting first_page */
- smp_wmb();
+ clear_compound_head(page_tail);

I would sleep better if this was done before setting all the page->flags above,
previously, PageTail was cleared by the first operation which is
"page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;"
I do realize that it doesn't use WRITE_ONCE, so it might have been theoretically
broken already, if it does matter.

Right. Nothing enforces particular order. If we really need to have some
ordering on PageTail() vs. page->flags let's define it, but so far I
don't see a reason to change this part.

OK.

diff --git a/mm/internal.h b/mm/internal.h
index 36b23f1e2ca6..89e21a07080a 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -61,9 +61,9 @@ static inline void __get_page_tail_foll(struct page *page,
* speculative page access (like in
* page_cache_get_speculative()) on tail pages.
*/
- VM_BUG_ON_PAGE(atomic_read(&page->first_page->_count) <= 0, page);
+ VM_BUG_ON_PAGE(atomic_read(&compound_head(page)->_count) <= 0, page);
if (get_page_head)
- atomic_inc(&page->first_page->_count);
+ atomic_inc(&compound_head(page)->_count);

Doing another compound_head() seems like overkill when this code already assumes
PageTail.

"Overkill"? It's too strong wording for re-read hot cache line.

Hm good point.

All callers do it after if (PageTail()) which means they already did
READ_ONCE(page->compound_head) and here they do another one. Potentially with
different result in bit 0, which would be a subtle bug, that could be
interesting to catch with some VM_BUG_ON. I don't know if a direct plain
page->compound_head access here instead of compound_head() would also result in
a re-read, since the previous access did use READ_ONCE(). Maybe it would be best
to reorganize the code here and in the 3 call sites so that the READ_ONCE() used
to determine PageTail also obtains the compound head pointer.

All we would possbily win by the change is few bytes in code. Additional
READ_ONCE() only affect code generation. It doesn't introduce any cpu
barriers. The cache line with compound_head is in L1 anyway.

I don't see justification to change this part too. If you think we can
gain something by reworking this code, feel free to propose patch on top.

OK, it's probably not worth:

add/remove: 0/0 grow/shrink: 1/3 up/down: 7/-66 (-59)
function old new delta
follow_trans_huge_pmd 516 523 +7
follow_page_pte 905 893 -12
follow_hugetlb_page 943 919 -24
__get_page_tail 440 410 -30


diff --git a/mm/swap.c b/mm/swap.c
index a3a0a2f1f7c3..faa9e1687dea 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -200,7 +200,7 @@ out_put_single:
__put_single_page(page);
return;
}
- VM_BUG_ON_PAGE(page_head != page->first_page, page);
+ VM_BUG_ON_PAGE(page_head != compound_head(page), page);
/*
* We can release the refcount taken by
* get_page_unless_zero() now that
@@ -261,7 +261,7 @@ static void put_compound_page(struct page *page)
* Case 3 is possible, as we may race with
* __split_huge_page_refcount tearing down a THP page.
*/
- page_head = compound_head_by_tail(page);
+ page_head = compound_head(page);

This is also in a path after PageTail() is true.

We can only save one branch here: other PageTail() is most likely in other
compilation unit and compiler would not be able to eliminate additional
load.
Why bother?

Hmm, right. Bah.

add/remove: 0/0 grow/shrink: 1/0 up/down: 8/0 (8)
function old new delta
put_compound_page 500 508 +8



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/