On Thu, 16 Mar 2017 15:08:46 +0800 Wei Wang <wei.w.wang@xxxxxxxxx> wrote:
From: Liang Li <liang.z.li@xxxxxxxxx>

This patch adds a function to provide a snapshot of the system's currently
unused pages. An important use of this function is to provide the
unused pages to the live migration thread, which skips the transfer of
those unused pages. Newly used pages can be re-tracked by the dirty
page logging mechanisms.

I don't think this will be useful for anything other than
virtio-balloon. I guess it would be better to keep this code in the
virtio-balloon driver if possible, even though that's rather a layering
violation :( What would have to be done to make that possible? Perhaps
we can put some *small* helpers into page_alloc.c to prevent things
from becoming too ugly.

--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4498,6 +4498,120 @@ void show_free_areas(unsigned int filter)
show_swap_cache_info();
}
+static int __record_unused_pages(struct zone *zone, int order,
+ __le64 *buf, unsigned int size,
+ unsigned int *offset, bool part_fill)
+{
+ unsigned long pfn, flags;
+ int t, ret = 0;
+ struct list_head *curr;
+ __le64 *chunk;
+
+ if (zone_is_empty(zone))
+ return 0;
+
+ spin_lock_irqsave(&zone->lock, flags);
+
+ if (*offset + zone->free_area[order].nr_free > size && !part_fill) {
+ ret = -ENOSPC;
+ goto out;
+ }
+ for (t = 0; t < MIGRATE_TYPES; t++) {
+ list_for_each(curr, &zone->free_area[order].free_list[t]) {
+ pfn = page_to_pfn(list_entry(curr, struct page, lru));
+ chunk = buf + *offset;
+ if (*offset + 2 > size) {
+ ret = -ENOSPC;
+ goto out;
+ }
+ /* Align to the chunk format used in virtio-balloon */
+ *chunk = cpu_to_le64(pfn << 12);
+ *(chunk + 1) = cpu_to_le64((1 << order) << 12);
+ *offset += 2;
+ }
+ }
+
+out:
+ spin_unlock_irqrestore(&zone->lock, flags);
+
+ return ret;
+}

This looks like it could disable interrupts for a long time. Too long?

+/*
+ * The record_unused_pages() function is used to record the system's unused
+ * pages. Transfer of the unused pages can be skipped during live migration.
+ * Though the unused pages are dynamically changing, dirty page logging
+ * mechanisms are able to capture the newly used pages even though they were
+ * recorded as unused pages via this function.
+ *
+ * This function scans the free page list of the specified order to record
+ * the unused pages, and chunks those contiguous pages following the chunk
+ * format below:
+ * --------------------------------------
+ * | Base (52-bit) | Rsvd (12-bit) |
+ * --------------------------------------
+ * --------------------------------------
+ * | Size (52-bit) | Rsvd (12-bit) |
+ * --------------------------------------
+ *
+ * @start_zone: zone to start the record operation.
+ * @order: order of the free page list to record.
+ * @buf: buffer to record the unused page info in chunks.
+ * @size: size of the buffer in __le64 to record
+ * @offset: offset in the buffer to record.
+ * @part_fill: indicate if partial fill is used.
+ *
+ * return -EINVAL if parameter is invalid
+ * return -ENOSPC when the buffer is too small to record all the unused pages
+ * return 0 on success
+ */

It's a strange thing - it returns information which will instantly
become incorrect.
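
For reference, the chunk format in the quoted comment (a 52-bit base and a 52-bit size, each followed by 12 reserved bits, stored as little-endian 64-bit words) can be sketched in plain userspace C. This is only an illustration under my own assumptions, not the patch's code: record_chunk(), chunk_pfn(), chunk_nr_pages(), to_le64(), from_le64() and CHUNK_SHIFT are hypothetical names, and to_le64() merely stands in for the kernel's cpu_to_le64().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SHIFT 12	/* low 12 bits of each word are reserved */

/* Portable stand-in for cpu_to_le64(): lay the value out LSB first. */
static uint64_t to_le64(uint64_t v)
{
	uint8_t b[8];
	uint64_t out;
	int i;

	for (i = 0; i < 8; i++)
		b[i] = (uint8_t)(v >> (8 * i));
	memcpy(&out, b, sizeof(out));
	return out;
}

/* Inverse of to_le64(): rebuild the CPU-order value from LSB-first bytes. */
static uint64_t from_le64(uint64_t v)
{
	uint8_t b[8];
	uint64_t out = 0;
	int i;

	memcpy(b, &v, sizeof(b));
	for (i = 0; i < 8; i++)
		out |= (uint64_t)b[i] << (8 * i);
	return out;
}

/*
 * Append one (base, size) pair describing a free block of 2^order pages
 * starting at pfn. buf holds `size` 64-bit words and *offset is the next
 * free word. Returns 0 on success, -ENOSPC if two more words do not fit.
 */
static int record_chunk(uint64_t *buf, size_t size, size_t *offset,
			uint64_t pfn, unsigned int order)
{
	assert(order < 52);	/* the size field is only 52 bits wide */

	if (*offset + 2 > size)
		return -ENOSPC;

	buf[*offset] = to_le64(pfn << CHUNK_SHIFT);
	buf[*offset + 1] = to_le64((UINT64_C(1) << order) << CHUNK_SHIFT);
	*offset += 2;
	return 0;
}

/* Recover the starting pfn from a stored chunk. */
static uint64_t chunk_pfn(const uint64_t *chunk)
{
	return from_le64(chunk[0]) >> CHUNK_SHIFT;
}

/* Recover the number of pages from a stored chunk. */
static uint64_t chunk_nr_pages(const uint64_t *chunk)
{
	return from_le64(chunk[1]) >> CHUNK_SHIFT;
}
```

Each recorded free block consumes two words of the buffer, so a buffer of `size` words holds at most size / 2 chunks.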