On Thu, Oct 08, 2020 at 10:25:44AM -0700, Ralph Campbell wrote:
ZONE_DEVICE struct pages have an extra reference count that complicates the
code for put_page() and several places in the kernel that need to check the
reference count to see that a page is not being used (gup, compaction,
migration, etc.). Clean up the code so the reference count doesn't need to
be treated specially for device private pages, leaving DAX as still being
a special case.
What about the check in mc_handle_swap_pte()?
mm/memcontrol.c:
5513 /*
5514 * MEMORY_DEVICE_PRIVATE means ZONE_DEVICE page and which have
5515 * a refcount of 1 when free (unlike normal page)
5516 */
5517 if (!page_ref_add_unless(page, 1, 1))
5518 return NULL;
... does that need to change? Perhaps just the comment?
Signed-off-by: Ralph Campbell <rcampbell@xxxxxxxxxx>
---
[snip]
void put_devmap_managed_page(struct page *page);
diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index e151a7f10519..bf92a261fa6f 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -509,10 +509,15 @@ static bool dmirror_allocate_chunk(struct dmirror_device *mdevice,
mdevice->devmem_count * (DEVMEM_CHUNK_SIZE / (1024 * 1024)),
pfn_first, pfn_last);
+ /*
+ * Pages are created with an initial reference count of one but should
+ * have a reference count of zero while in the free state.
+ */
spin_lock(&mdevice->lock);
for (pfn = pfn_first; pfn < pfn_last; pfn++) {
struct page *page = pfn_to_page(pfn);
+ set_page_count(page, 0);
This confuses me. How does this and init_page_count() not confuse the buddy
allocator? Don't you have to reset the refcount somewhere after the test?
page->zone_device_data = mdevice->free_pages;
mdevice->free_pages = page;
}
@@ -561,7 +566,7 @@ static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
}
dpage->zone_device_data = rpage;
- get_page(dpage);
+ init_page_count(dpage);
lock_page(dpage);
return dpage;
diff --git a/mm/internal.h b/mm/internal.h
index c43ccdddb0f6..e1443b73aa9b 100644
--- a/mm/internal.h
+++ b/mm/internal.h
[snip]
diff --git a/mm/swap.c b/mm/swap.c
index 0eb057141a04..93d880c6f73c 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -116,12 +116,11 @@ static void __put_compound_page(struct page *page)
void __put_page(struct page *page)
{
if (is_zone_device_page(page)) {
- put_dev_pagemap(page->pgmap);
-
/*
* The page belongs to the device that created pgmap. Do
* not return it to page allocator.
*/
+ free_zone_device_page(page);
I really like this.
Ira