Re: [PATCH 2/4] Add replace_page(), change the mapping of pte fromone page into another

From: Izik Eidus
Date: Tue Nov 11 2008 - 15:59:18 EST


Andrew Morton wrote:
On Tue, 11 Nov 2008 15:21:39 +0200
Izik Eidus <ieidus@xxxxxxxxxx> wrote:

From: Izik Eidus <izike@xxxxxxxxxxxx>

this function is needed in cases you want to change the userspace
virtual mapping into diffrent physical page,

Not sure that I understand that description. We want to replace a live
page in an anonymous VMA with a different one?

It looks that way.
yes but it replace it with kernel allocated page.
page migration already kinda does that. Is there common ground?

page migration as far as i saw cant migrate anonymous page into kernel page.
if you want we can change page_migration to do that, but i thought you will rather have ksm changes separate.

KSM need this for merging the identical pages.

this function is working by removing the oldpage from the rmap and
calling put_page on it, and by setting the virtual address pte
to point into the new page.
(note that the new page (the page that we change the pte to map to)
cannot be anonymous page)


I don't understand the restrictions on anonymous pages. Please expand
the changelog so that reviewers can understand the reasons for this
restriction.
the page that we are going to map into the pte going to be nonlinear from the point of view of anon-vma
therefore it cannot be anonymous.


---
include/linux/mm.h | 3 ++
mm/memory.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 71 insertions(+), 0 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ffee2f7..4da7fa8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1207,6 +1207,9 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn);
+int replace_page(struct vm_area_struct *vma, struct page *oldpage,
+ struct page *newpage, pte_t orig_pte, pgprot_t prot);
+
struct page *follow_page(struct vm_area_struct *, unsigned long address,
unsigned int foll_flags);
#define FOLL_WRITE 0x01 /* check pte is writable */
diff --git a/mm/memory.c b/mm/memory.c
index 164951c..b2c542c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1472,6 +1472,74 @@ int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
}
EXPORT_SYMBOL(vm_insert_mixed);
+/**
+ * replace _page - replace the pte mapping related to vm area between two pages

s/replace _page/replace_page/

+ * (from oldpage to newpage)
+ * NOTE: you should take into consideration the impact on the VM when replacing
+ * anonymous pages with kernel non swappable pages.
+ */

This _is_ a kerneldoc comment, but kernedoc comments conventionally
document the arguments and the return value also.

+int replace_page(struct vm_area_struct *vma, struct page *oldpage,
+ struct page *newpage, pte_t orig_pte, pgprot_t prot)
+{
+ struct mm_struct *mm = vma->vm_mm;
+ pgd_t *pgd;
+ pud_t *pud;
+ pmd_t *pmd;
+ pte_t *ptep;
+ spinlock_t *ptl;
+ unsigned long addr;
+ int ret;
+
+ BUG_ON(PageAnon(newpage));
+
+ ret = -EFAULT;
+ addr = page_address_in_vma(oldpage, vma);
+ if (addr == -EFAULT)
+ goto out;
+
+ pgd = pgd_offset(mm, addr);
+ if (!pgd_present(*pgd))
+ goto out;
+
+ pud = pud_offset(pgd, addr);
+ if (!pud_present(*pud))
+ goto out;
+
+ pmd = pmd_offset(pud, addr);
+ if (!pmd_present(*pmd))
+ goto out;
+
+ ptep = pte_offset_map_lock(mm, pmd, addr, &ptl);
+ if (!ptep)
+ goto out;
+
+ if (!pte_same(*ptep, orig_pte)) {
+ pte_unmap_unlock(ptep, ptl);
+ goto out;
+ }
+
+ ret = 0;
+ get_page(newpage);
+ page_add_file_rmap(newpage);
+
+ flush_cache_page(vma, addr, pte_pfn(*ptep));
+ ptep_clear_flush(vma, addr, ptep);
+ set_pte_at(mm, addr, ptep, mk_pte(newpage, prot));
+
+ page_remove_rmap(oldpage, vma);
+ if (PageAnon(oldpage)) {
+ dec_mm_counter(mm, anon_rss);
+ inc_mm_counter(mm, file_rss);
+ }
+ put_page(oldpage);
+
+ pte_unmap_unlock(ptep, ptl);
+
+out:
+ return ret;
+}
+EXPORT_SYMBOL(replace_page);

Again, we could make the presence of this code selectable by subsystems
which want it.

]

sure.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/