[PATCH v1 1/1] mm/ksm: improve deduplication of zero pages with colouring

From: Claudio Imbrenda
Date: Thu Jan 12 2017 - 11:17:26 EST


Some architectures have a set of zero pages (coloured zero pages)
instead of only one zero page, in order to improve cache performance.
In those cases, kernel samepage merging (KSM) would merge all the
allocated pages that happen to be filled with zeroes into the same
deduplicated page, thus losing all the advantages of coloured zero
pages.

This patch fixes this behaviour. When coloured zero pages are present,
the checksum of a zero page is calculated during initialisation and
compared with the checksum of the current candidate during merging. In
case of a match, the normal merging routine is used to merge the page
with the correct coloured zero page, which ensures that the candidate
page is actually verified to be identical to the target zero page
before the merge happens.
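
To make that last point concrete, here is a toy illustration in plain
userspace C (not the KSM code; toy_checksum merely stands in for
calc_checksum): the precomputed checksum only acts as a cheap filter,
and a candidate whose checksum matches is still compared byte by byte
(in KSM, by the regular merging routine) before it is treated as a zero
page:

#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Stand-in for calc_checksum(); any cheap hash works for this example. */
static unsigned int toy_checksum(const unsigned char *page)
{
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < PAGE_SIZE; i++)
		sum = sum * 31 + page[i];
	return sum;
}

int main(void)
{
	static unsigned char zero_page[PAGE_SIZE];	/* all zeroes */
	static unsigned char candidate[PAGE_SIZE];	/* page being scanned */
	unsigned int zero_checksum = toy_checksum(zero_page); /* once, at init */

	if (toy_checksum(candidate) == zero_checksum &&
	    !memcmp(candidate, zero_page, PAGE_SIZE))
		printf("candidate can be merged with the zero page\n");
	else
		printf("false match or page not empty, keep scanning\n");
	return 0;
}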

This behaviour is noticeable when a process accesses large arrays of
allocated pages containing zeroes. A test I conducted on s390 shows
that there is a speed penalty when KSM merges such pages, compared to
not merging them or to using actual zero pages from the start (i.e.
without ever breaking the COW).
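
The workload is essentially of the following shape (a sketch, not the
actual test behind the numbers above; the 256 MiB mapping size is
arbitrary): allocate a large anonymous area, fill it with zeroes so the
pages really get allocated, mark it MADV_MERGEABLE, let ksmd scan it,
and then time how long reading it back takes:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define SIZE (256UL << 20)	/* assumption: 256 MiB of zeroed pages */

int main(void)
{
	unsigned char *buf;
	unsigned long i, sum = 0;

	buf = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return 1;
	memset(buf, 0, SIZE);			/* fault in zero-filled pages */
	madvise(buf, SIZE, MADV_MERGEABLE);	/* make them eligible for KSM */

	/* ... give ksmd time to scan, then time this loop ... */
	for (i = 0; i < SIZE; i += 64)
		sum += buf[i];

	printf("sum=%lu\n", sum);
	return 0;
}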

With this patch, the performance with KSM enabled is the same as with
actual, non-COW-broken zero pages, which is in turn the same as the
performance without KSM.

Signed-off-by: Claudio Imbrenda <imbrenda@xxxxxxxxxxxxxxxxxx>
---
mm/ksm.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)

diff --git a/mm/ksm.c b/mm/ksm.c
index 9ae6011..b0cfc30 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -223,6 +223,11 @@ struct rmap_item {
/* Milliseconds ksmd should sleep between batches */
static unsigned int ksm_thread_sleep_millisecs = 20;

+#ifdef __HAVE_COLOR_ZERO_PAGE
+/* Checksum of an empty (zeroed) page */
+static unsigned int zero_checksum;
+#endif
+
#ifdef CONFIG_NUMA
/* Zeroed when merging across nodes is not allowed */
static unsigned int ksm_merge_across_nodes = 1;
@@ -1467,6 +1472,25 @@ static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item)
return;
}

+#ifdef __HAVE_COLOR_ZERO_PAGE
+	/*
+	 * Same checksum as an empty page. We attempt to merge it with the
+	 * appropriate zero page.
+	 */
+	if (checksum == zero_checksum) {
+		struct vm_area_struct *vma;
+
+		vma = find_mergeable_vma(rmap_item->mm, rmap_item->address);
+		err = try_to_merge_one_page(vma, page,
+					ZERO_PAGE(rmap_item->address));
+		/*
+		 * In case of failure, the page was not really empty, so we
+		 * need to continue. Otherwise we're done.
+		 */
+		if (!err)
+			return;
+	}
+#endif
tree_rmap_item =
unstable_tree_search_insert(rmap_item, page, &tree_page);
if (tree_rmap_item) {
@@ -2304,6 +2328,11 @@ static int __init ksm_init(void)
struct task_struct *ksm_thread;
int err;

+#ifdef __HAVE_COLOR_ZERO_PAGE
+	/* The correct value depends on page size and endianness */
+	zero_checksum = calc_checksum(ZERO_PAGE(0));
+#endif
+
err = ksm_slab_init();
if (err)
goto out;
--
1.9.1