On Thu, Feb 08, 2018 at 02:39:26PM -0800, Andrew Morton wrote:
> On Tue, 6 Feb 2018 08:06:36 +0800 Yang Shi <yang.shi@xxxxxxxxxxxxxxxxx> wrote:
>
> > For PTE-mapped THP, the compound THP has not been split into normal 4K
> > pages yet, so the whole THP is considered referenced if any one of its
> > sub pages is referenced.
> >
> > When walking a PTE-mapped THP with pvmw, all relevant PTEs are checked
> > to retrieve the referenced bit. But the current code just returns the
> > result of the last PTE. If the last PTE has not been referenced, the
> > referenced flag will be cleared.
> >
> > So, just break out of the pvmw walk once a referenced PTE is found if
> > the page is part of a THP.
> >
> > ...
> >
> > --- a/mm/page_idle.c
> > +++ b/mm/page_idle.c
> > @@ -67,6 +67,14 @@ static bool page_idle_clear_pte_refs_one(struct page *page,
> >  		if (pvmw.pte) {
> >  			referenced = ptep_clear_young_notify(vma, addr,
> >  					pvmw.pte);
> > +			/*
> > +			 * For PTE-mapped THP, one sub page is referenced,
> > +			 * the whole THP is referenced.
> > +			 */
> > +			if (referenced && PageTransCompound(pvmw.page)) {
> > +				page_vma_mapped_walk_done(&pvmw);
> > +				break;
> > +			}
>
> This means that the function will no longer clear the referenced bits
> in all the ptes. What effect does this have and should we document
> this in some fashion?

Yeah, the patch is wrong. We need to get all ptes for THP cleared.

What about something like this instead (untested):
diff --git a/mm/page_idle.c b/mm/page_idle.c
index 0a49374e6931..6876522c9dce 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -65,10 +65,10 @@ static bool page_idle_clear_pte_refs_one(struct page *page,
 	while (page_vma_mapped_walk(&pvmw)) {
 		addr = pvmw.address;
 		if (pvmw.pte) {
-			referenced = ptep_clear_young_notify(vma, addr,
+			referenced |= ptep_clear_young_notify(vma, addr,
 					pvmw.pte);
 		} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
-			referenced = pmdp_clear_young_notify(vma, addr,
+			referenced |= pmdp_clear_young_notify(vma, addr,
 					pvmw.pmd);
 		} else {
 			/* unexpected pmd-mapped page? */
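
To make the difference between the three variants concrete, here is a rough userspace sketch (not kernel code, and not tested against a kernel). It assumes each PTE's young bit can be modelled as one bool in an array; clear_young(), walk_last_wins(), walk_break_early(), walk_accumulate() and count_young() are hypothetical names made up for illustration, with clear_young() standing in for ptep_clear_young_notify().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for ptep_clear_young_notify(): clear one
 * simulated young bit and report whether it was set.
 */
static bool clear_young(bool *young)
{
	bool was_young = *young;

	*young = false;
	return was_young;
}

/* Current code: each iteration overwrites the result, so the last PTE wins. */
static bool walk_last_wins(bool *young, size_t n)
{
	bool referenced = false;

	assert(young != NULL);
	for (size_t i = 0; i < n; i++)
		referenced = clear_young(&young[i]);
	return referenced;
}

/* First patch: stop at the first referenced PTE; later bits stay set. */
static bool walk_break_early(bool *young, size_t n)
{
	assert(young != NULL);
	for (size_t i = 0; i < n; i++) {
		if (clear_young(&young[i]))
			return true;
	}
	return false;
}

/* Alternative: clear every PTE and OR the results together. */
static bool walk_accumulate(bool *young, size_t n)
{
	bool referenced = false;

	assert(young != NULL);
	for (size_t i = 0; i < n; i++)
		referenced |= clear_young(&young[i]);
	return referenced;
}

/* Count the simulated young bits that are still set after a walk. */
static size_t count_young(const bool *young, size_t n)
{
	size_t count = 0;

	assert(young != NULL);
	for (size_t i = 0; i < n; i++)
		count += young[i] ? 1 : 0;
	return count;
}
```

With young bits {1, 0, 1, 0}, walk_last_wins() clears everything but reports "not referenced", walk_break_early() reports "referenced" but leaves one bit set, and walk_accumulate() reports "referenced" with all bits cleared. Only the last one gives both the right answer and a clean state for the next idle scan, which is what the |= change is after.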