Re: [PATCH 00/14] Small step toward KSM for file back page.

From: Matthew Wilcox
Date: Wed Oct 07 2020 - 18:09:21 EST


On Wed, Oct 07, 2020 at 01:54:19PM -0400, Jerome Glisse wrote:
> > For other things (NUMA distribution), we can point to something which
> > isn't a struct page and can be distinguished from a real struct page by a
> > bit somewhere (I have ideas for at least three bits in struct page that
> > could be used for this). Then use a pointer in that data structure to
> > point to the real page. Or do NUMA distribution at the inode level.
> > Have a way to get from (inode, node) to an address_space which contains
> > just regular pages.
>
> How do you find all the copies? KSM maintains a list for a reason.
> The same would be needed here, because if you want to break the write
> protection you need to find all the copies first. If you intend to walk
> page tables, then how do you synchronize to avoid more copies spawning
> while you walk the reverse mapping? We could lock the struct page, I
> guess. Also, how do you walk device page tables, which are completely
> hidden from core mm?

So ... why don't you put a PageKsm page in the page cache? That way you
can share code with the current KSM implementation. You'd need
something like this:

+++ b/mm/filemap.c
@@ -1622,6 +1622,9 @@ struct page *find_lock_entry(struct address_space *mapping, pgoff_t index)
 		lock_page(page);
 		/* Has the page been truncated? */
 		if (unlikely(page->mapping != mapping)) {
+			if (PageKsm(page)) {
+				...
+			}
 			unlock_page(page);
 			put_page(page);
 			goto repeat;
@@ -1655,6 +1658,7 @@ struct page *find_lock_entry(struct address_space *mapping, pgoff_t index)
  * * %FGP_WRITE - The page will be written
  * * %FGP_NOFS - __GFP_FS will get cleared in gfp mask
  * * %FGP_NOWAIT - Don't get blocked by page lock
+ * * %FGP_KSM - Return KSM pages
  *
  * If %FGP_LOCK or %FGP_CREAT are specified then the function may sleep even
  * if the %GFP flags specified for %FGP_CREAT are atomic.
@@ -1687,6 +1691,11 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,

 		/* Has the page been truncated? */
 		if (unlikely(page->mapping != mapping)) {
+			if (PageKsm(page)) {
+				if (fgp_flags & FGP_KSM)
+					return page;
+				...
+			}
 			unlock_page(page);
 			put_page(page);
 			goto repeat;

I don't know what you want to do when you find a KSM page, so I just left
an ellipsis.
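
To make the control flow of the pagecache_get_page() hunk concrete, here is a
toy userspace model of the check. This is not kernel code: struct page,
struct address_space, the FGP_KSM value, the enum and classify_page() are all
stand-ins invented for illustration, and the open "..." case is only reported
as its own outcome rather than filled in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy flag value; only the name FGP_KSM comes from the patch sketch. */
#define FGP_KSM 0x2u

/* Minimal stand-ins for the kernel structures (assumptions, not real layouts). */
struct address_space {
	int id;
};

struct page {
	const struct address_space *mapping;	/* owning mapping, if any */
	bool ksm;				/* stand-in for PageKsm(page) */
};

/* Hypothetical outcomes of the "has the page been truncated?" check. */
enum lookup_result {
	LOOKUP_RETURN,		/* hand the page back to the caller */
	LOOKUP_RETRY,		/* unlock, put and "goto repeat" */
	LOOKUP_KSM_UNHANDLED,	/* the "..." branch: policy left open */
};

static enum lookup_result classify_page(const struct page *page,
					const struct address_space *mapping,
					unsigned int fgp_flags)
{
	assert(page != NULL);

	if (page->mapping != mapping) {
		if (page->ksm) {
			/* Caller opted in to seeing KSM pages. */
			if (fgp_flags & FGP_KSM)
				return LOOKUP_RETURN;
			/* Deliberately unspecified, as in the patch sketch. */
			return LOOKUP_KSM_UNHANDLED;
		}
		/* Ordinary truncation race: look the page up again. */
		return LOOKUP_RETRY;
	}
	return LOOKUP_RETURN;
}
```

The point of the model is only that a KSM page fails the page->mapping
comparison like a truncated page does, so it has to be told apart before the
retry path runs.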