Re: [PATCH v1 1/1] mm/ksm: improve deduplication of zero pages with colouring

From: Christian Borntraeger
Date: Thu Jan 12 2017 - 11:21:56 EST


On 01/12/2017 05:17 PM, Claudio Imbrenda wrote:
> Some architectures have a set of zero pages (coloured zero pages)
> instead of only one zero page, in order to improve the cache
> performance. In those cases, the kernel samepage merger (KSM) would
> merge all the allocated pages that happen to be filled with zeroes to
> the same deduplicated page, thus losing all the advantages of coloured
> zero pages.
>
> This patch fixes this behaviour. When coloured zero pages are present,
> the checksum of a zero page is calculated during initialisation, and
> compared with the checksum of the current candidate during merging. In
> case of a match, the normal merging routine is used to merge the page
> with the correct coloured zero page, which ensures the candidate page
> is checked to be equal to the target zero page.
>
> This behaviour is noticeable when a process accesses large arrays of
> allocated pages containing zeroes. A test I conducted on s390 shows
> that there is a speed penalty when KSM merges such pages, compared to
> not merging them or using actual zero pages from the start without
> breaking the COW.
>
> With this patch, the performance with KSM is the same as with non
> COW-broken actual zero pages, which is also the same as without KSM.
>
> Signed-off-by: Claudio Imbrenda <imbrenda@xxxxxxxxxxxxxxxxxx>

FWIW, I cannot judge whether the memory management part (the patch below)
is correct and sane. But this issue (losing the cache colouring for the
zero page) is certainly a reason not to use KSM on s390 for specific
workloads (large sparse matrices backed by the guest empty zero page).

This patch will fix that.
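
To make the colouring concrete, here is a minimal user-space sketch of how
a virtual address can select one of several zero pages. It is an
illustration only, not the s390 implementation: the names
sketch_zero_pages, sketch_zero_page_colour() and sketch_zero_page(), the
4 KiB page size, and the choice of 16 colours are all assumptions made for
this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; real ones are architecture-specific. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_NR_COLOURS 16	/* must be a power of two */

/* A contiguous block of zero-filled pages, one per colour. */
static unsigned char sketch_zero_pages[SKETCH_NR_COLOURS * SKETCH_PAGE_SIZE];

/* The colour is taken from the low bits of the virtual page number. */
static size_t sketch_zero_page_colour(uintptr_t vaddr)
{
	return (vaddr >> SKETCH_PAGE_SHIFT) & (SKETCH_NR_COLOURS - 1);
}

/* Return the zero page whose colour matches the given virtual address. */
static const unsigned char *sketch_zero_page(uintptr_t vaddr)
{
	return sketch_zero_pages +
	       sketch_zero_page_colour(vaddr) * SKETCH_PAGE_SIZE;
}
```

Neighbouring virtual pages map to different zero pages, so a linear walk
over a large zeroed array spreads over several cache colours. Merging every
zeroed page into one deduplicated page collapses that spread to a single
colour, which is the loss described above.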


> ---
> mm/ksm.c | 29 +++++++++++++++++++++++++++++
> 1 file changed, 29 insertions(+)
>
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 9ae6011..b0cfc30 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -223,6 +223,11 @@ struct rmap_item {
> /* Milliseconds ksmd should sleep between batches */
> static unsigned int ksm_thread_sleep_millisecs = 20;
>
> +#ifdef __HAVE_COLOR_ZERO_PAGE
> +/* Checksum of an empty (zeroed) page */
> +static unsigned int zero_checksum;
> +#endif
> +
> #ifdef CONFIG_NUMA
> /* Zeroed when merging across nodes is not allowed */
> static unsigned int ksm_merge_across_nodes = 1;
> @@ -1467,6 +1472,25 @@ static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item)
>  		return;
>  	}
>
> +#ifdef __HAVE_COLOR_ZERO_PAGE
> +	/*
> +	 * Same checksum as an empty page. We attempt to merge it with the
> +	 * appropriate zero page.
> +	 */
> +	if (checksum == zero_checksum) {
> +		struct vm_area_struct *vma;
> +
> +		vma = find_mergeable_vma(rmap_item->mm, rmap_item->address);
> +		err = try_to_merge_one_page(vma, page,
> +					    ZERO_PAGE(rmap_item->address));
> +		/*
> +		 * In case of failure, the page was not really empty, so we
> +		 * need to continue. Otherwise we're done.
> +		 */
> +		if (!err)
> +			return;
> +	}
> +#endif
>  	tree_rmap_item =
>  		unstable_tree_search_insert(rmap_item, page, &tree_page);
>  	if (tree_rmap_item) {
> @@ -2304,6 +2328,11 @@ static int __init ksm_init(void)
>  	struct task_struct *ksm_thread;
>  	int err;
>
> +#ifdef __HAVE_COLOR_ZERO_PAGE
> +	/* The correct value depends on page size and endianness */
> +	zero_checksum = calc_checksum(ZERO_PAGE(0));
> +#endif
> +
>  	err = ksm_slab_init();
>  	if (err)
>  		goto out;
>
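
The flow in the patch is a two-step check: a cheap checksum comparison
first, then a full comparison (which the commit message says the normal
merging routine performs) before the page is actually merged. A minimal
user-space sketch of that idea follows. It is not the kernel code: the
FNV-1a hash is only a stand-in for calc_checksum(), and the names
sketch_checksum() and sketch_is_zero_candidate() as well as the 4 KiB page
size are assumptions made for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* hypothetical page size */

/* Stand-in checksum (32-bit FNV-1a); the kernel uses its own calc_checksum(). */
static uint32_t sketch_checksum(const unsigned char *page)
{
	uint32_t hash = 2166136261u;
	size_t i;

	for (i = 0; i < SKETCH_PAGE_SIZE; i++) {
		hash ^= page[i];
		hash *= 16777619u;
	}
	return hash;
}

/*
 * Step 1: compare against the checksum computed once at init time.
 * Step 2: a matching checksum is not proof, so compare the full contents.
 */
static bool sketch_is_zero_candidate(const unsigned char *page,
				     const unsigned char *zero_page,
				     uint32_t zero_checksum)
{
	if (sketch_checksum(page) != zero_checksum)
		return false;
	return memcmp(page, zero_page, SKETCH_PAGE_SIZE) == 0;
}
```

A checksum collision only costs one extra full comparison; when that
comparison fails, the page is not empty and the caller falls through to the
regular unstable-tree path, as the comment in the patch describes.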