Re: [RFC PATCH 0/2] SKSM: Synchronous Kernel Samepage Merging

From: Mathieu Desnoyers
Date: Fri Feb 28 2025 - 12:53:39 EST


On 2025-02-28 11:32, Peter Xu wrote:
> On Fri, Feb 28, 2025 at 09:59:00AM -0500, Mathieu Desnoyers wrote:
>> For the VM use-case, I wonder if we could just add a userfaultfd
>> "COW" event that would notify userspace when a COW happens ?
>
> I don't know what's the best for KSM and how well this will work, but we
> have such event for years.. See UFFDIO_REGISTER_MODE_WP:
>
> https://man7.org/linux/man-pages/man2/userfaultfd.2.html

userfaultfd UFFDIO_REGISTER only seems to work if I pass an address
resulting from a mmap mapping, but returns EINVAL if I pass a
page-aligned address which sits within a private file mapping
(e.g. executable data).
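
For anyone wanting to reproduce this, here is a minimal sketch of the
registration attempt. The helper names page_align_down() and
try_uffd_wp_register() are made up for illustration, and requesting
UFFD_FEATURE_PAGEFAULT_FLAG_WP at UFFDIO_API time is an assumption about
the setup, not necessarily what was run for the observation above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Round an address down to the start of its page (hypothetical helper). */
static uintptr_t page_align_down(uintptr_t addr, size_t page_size)
{
	/* The mask trick below only works for power-of-two page sizes. */
	assert(page_size && (page_size & (page_size - 1)) == 0);
	return addr & ~((uintptr_t)page_size - 1);
}

/*
 * Try to register [addr, addr + len) with UFFDIO_REGISTER_MODE_WP.
 * Returns 0 on success, or a negative errno value on failure
 * (hypothetical helper, not a kernel or libc API).
 */
static int try_uffd_wp_register(void *addr, size_t len)
{
	struct uffdio_api api;
	struct uffdio_register reg;
	int fd, ret = 0;

	/* May fail with EPERM depending on vm.unprivileged_userfaultfd. */
	fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	if (fd < 0)
		return -errno;

	/* API handshake; asking for the WP feature is an assumption. */
	memset(&api, 0, sizeof(api));
	api.api = UFFD_API;
	api.features = UFFD_FEATURE_PAGEFAULT_FLAG_WP;
	if (ioctl(fd, UFFDIO_API, &api) < 0) {
		ret = -errno;
		goto out;
	}

	/* Register the range for write-protect tracking. */
	memset(&reg, 0, sizeof(reg));
	reg.range.start = (unsigned long)addr;
	reg.range.len = len;
	reg.mode = UFFDIO_REGISTER_MODE_WP;
	if (ioctl(fd, UFFDIO_REGISTER, &reg) < 0)
		ret = -errno;
out:
	close(fd);
	return ret;
}
```

Calling try_uffd_wp_register() once on a fresh MAP_PRIVATE | MAP_ANONYMOUS
mapping and once on page_align_down() of the address of a global variable
should show whether the two cases differ on a given kernel. The result
depends on kernel version and configuration.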

Also, I notice that do_wp_page() only calls handle_userfault() with
VM_UFFD_WP when the vm_fault flags do not have FAULT_FLAG_UNSHARE
set.

AFAIU, as it stands now userfaultfd would not help tracking COW faults
caused by stores to private file mappings. Am I missing something ?

Thanks,

Mathieu



>> This would allow userspace to replace ksmd by tracking the age of
>> those anonymous pages, and issue madvise MADV_MERGE on them to
>> write-protect+merge them when it is deemed useful.
>>
>> With both a new userfaultfd COW event and madvise MADV_MERGE,
>> is there anything else that is fundamentally missing to move
>> all the scanning complexity of KSM to userspace for the VM
>> deduplication use-case ?
>>
>> Thanks,



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com