Re: [PATCH v15 4/7] mm: Introduce Reported pages
From: Nitesh Narayan Lal
Date: Mon Dec 16 2019 - 06:45:16 EST
On 12/5/19 11:22 AM, Alexander Duyck wrote:
> From: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
>
> In order to pave the way for free page reporting in virtualized
> environments we will need a way to get pages out of the free lists and
> identify those pages after they have been returned. To accomplish this,
> this patch adds the concept of a Reported Buddy, which is essentially
> meant to just be the Uptodate flag used in conjunction with the Buddy
> page type.
>
> To prevent the reported pages from leaking outside of the buddy lists I
> added a check to clear the PageReported bit in the del_page_from_free_list
> function. As a result any reported page that is split, merged, or
> allocated will have the flag cleared prior to the PageBuddy value being
> cleared.
>
> The process for reporting pages is fairly simple. Once we free a page that
> meets the minimum order for page reporting we will schedule a worker thread
> to start 2s or more in the future. That worker thread will begin working
> from the lowest supported page reporting order up to MAX_ORDER - 1 pulling
> unreported pages from the free list and storing them in the scatterlist.
>
> When processing each individual free list it is necessary for the worker
> thread to release the zone lock when it needs to stop and report the full
> scatterlist of pages. To reduce the work of the next iteration the worker
> thread will rotate the free list so that the first unreported page in the
> free list becomes the first entry in the list.
[...]
> k);
> +
> + return err;
> +}
> +
> +static int
> +page_reporting_process_zone(struct page_reporting_dev_info *prdev,
> + struct scatterlist *sgl, struct zone *zone)
> +{
> + unsigned int order, mt, leftover, offset = PAGE_REPORTING_CAPACITY;
> + unsigned long watermark;
> + int err = 0;
> +
> + /* Generate minimum watermark to be able to guarantee progress */
> + watermark = low_wmark_pages(zone) +
> + (PAGE_REPORTING_CAPACITY << PAGE_REPORTING_MIN_ORDER);
> +
> + /*
> + * Cancel request if insufficient free memory or if we failed
> + * to allocate page reporting statistics for the zone.
> + */
> + if (!zone_watermark_ok(zone, 0, watermark, 0, ALLOC_CMA))
> + return err;
> +
Wouldn't it make more sense to check the low watermark condition before each
reporting request, i.e. before every batch of 32 isolated pages?
Or would that be too costly?
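
A rough userspace sketch of the per-batch check asked about above, to make the
question concrete. This is not kernel code: struct fake_zone, watermark_ok(),
report_batches() and the "pressure" parameter are hypothetical stand-ins, and
the value used for PAGE_REPORTING_MIN_ORDER is an assumption for illustration.
Only the watermark formula is taken from the quoted hunk. The point is simply
that the check sits inside the loop, so it runs once per batch instead of once
per zone.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, for illustration only. */
#define PAGE_REPORTING_CAPACITY   32
#define PAGE_REPORTING_MIN_ORDER  9

/* Hypothetical stand-in for struct zone: only the counters we need. */
struct fake_zone {
	unsigned long free_pages;	/* pages currently on the free lists */
	unsigned long low_wmark;	/* low watermark, in pages */
};

/* Pages covered by one full batch of minimum-order entries. */
static unsigned long batch_pages(void)
{
	return (unsigned long)PAGE_REPORTING_CAPACITY << PAGE_REPORTING_MIN_ORDER;
}

/* Same formula as the quoted hunk: low watermark plus one full batch. */
static unsigned long reporting_watermark(const struct fake_zone *z)
{
	return z->low_wmark + batch_pages();
}

/* Hypothetical simplification of zone_watermark_ok(). */
static bool watermark_ok(const struct fake_zone *z)
{
	return z->free_pages > reporting_watermark(z);
}

/*
 * Report up to max_batches batches, re-checking the watermark before each
 * one. "pressure" models pages taken by other allocations while a batch is
 * being reported. Returns the number of batches reported.
 */
static unsigned int report_batches(struct fake_zone *z,
				   unsigned int max_batches,
				   unsigned long pressure)
{
	unsigned int done = 0;

	while (done < max_batches) {
		/* Per-batch check instead of a single check per zone. */
		if (!watermark_ok(z))
			break;

		/* The watermark guarantees a full batch can be isolated. */
		assert(z->free_pages >= batch_pages());
		z->free_pages -= batch_pages();	/* isolate the batch */

		/* ... the report callback would run here ... */

		z->free_pages += batch_pages();	/* drain back to the free list */
		z->free_pages -= (pressure < z->free_pages) ?
				 pressure : z->free_pages;
		done++;
	}

	return done;
}
```

In this model the extra cost is one comparison per batch; in the kernel the
real cost would depend on what zone_watermark_ok() has to do on each call.
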
> + /* Process each free list starting from lowest order/mt */
> + for (order = PAGE_REPORTING_MIN_ORDER; order < MAX_ORDER; order++) {
> + for (mt = 0; mt < MIGRATE_TYPES; mt++) {
> + /* We do not pull pages from the isolate free list */
> + if (is_migrate_isolate(mt))
> + continue;
> +
> + err = page_reporting_cycle(prdev, zone, order, mt,
> + sgl, &offset);
> + if (err)
> + return err;
> + }
> + }
> +
> + /* report the leftover pages before going idle */
> + leftover = PAGE_REPORTING_CAPACITY - offset;
> + if (leftover) {
> + sgl = &sgl[offset];
> + err = prdev->report(prdev, sgl, leftover);
> +
> + /* flush any remaining pages out from the last report */
> + spin_lock_irq(&zone->lock);
> + page_reporting_drain(prdev, sgl, leftover, !err);
> + spin_unlock_irq(&zone->lock);
> + }
> +
> + return err;
> +}
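
For reference, the leftover arithmetic above reads as if the scatterlist is
filled from its tail, with offset counting down from PAGE_REPORTING_CAPACITY.
A small userspace sketch under that assumption; struct fake_sg, batch_add()
and batch_leftover() are hypothetical names and the capacity value is assumed:

```c
#include <assert.h>

#define PAGE_REPORTING_CAPACITY 32	/* assumed value, for illustration */

/* Hypothetical stand-in for a scatterlist entry. */
struct fake_sg {
	unsigned long pfn;
};

/*
 * Store one entry just below the current offset, so the array fills from
 * its tail. Returns 1 on success, 0 when the batch is already full.
 */
static int batch_add(struct fake_sg *sgl, unsigned int *offset,
		     unsigned long pfn)
{
	if (*offset == 0)
		return 0;

	--(*offset);
	sgl[*offset].pfn = pfn;
	return 1;
}

/* Entries queued but not yet reported; they occupy sgl[offset..CAPACITY-1]. */
static unsigned int batch_leftover(unsigned int offset)
{
	assert(offset <= PAGE_REPORTING_CAPACITY);
	return PAGE_REPORTING_CAPACITY - offset;
}
```

With five entries queued, offset is 27, batch_leftover() gives 5, and
&sgl[offset] points at the first of the five pending entries, which matches
the "sgl = &sgl[offset]" step before the final report call.
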
--
Nitesh