Re: [PATCH v12 6/8] mm: support reporting free page blocks

From: Wei Wang
Date: Wed Jul 26 2017 - 07:41:50 EST


On 07/26/2017 06:24 PM, Michal Hocko wrote:
On Wed 26-07-17 10:22:23, Wei Wang wrote:
On 07/25/2017 10:53 PM, Michal Hocko wrote:
On Tue 25-07-17 14:47:16, Wang, Wei W wrote:
On Tuesday, July 25, 2017 8:42 PM, Michal Hocko wrote:
On Tue 25-07-17 19:56:24, Wei Wang wrote:
On 07/25/2017 07:25 PM, Michal Hocko wrote:
On Tue 25-07-17 17:32:00, Wei Wang wrote:
On 07/24/2017 05:00 PM, Michal Hocko wrote:
On Wed 19-07-17 20:01:18, Wei Wang wrote:
On 07/19/2017 04:13 PM, Michal Hocko wrote:
[...]
We don't need to do the pfn walk in the guest kernel. When the API
reports, for example, a 2MB free page block, the API caller passes the
base address of the page block and size=2MB to the hypervisor.
So you want to skip pfn walks by regularly calling into the page allocator to
update your bitmap. If that is the case then would an API that would allow you
to update your bitmap via a callback be sufficient? Something like
void walk_free_mem(int node, int min_order,
void (*visit)(unsigned long pfn, unsigned long nr_pages))

The function will call the given callback for each free memory block on the given
node starting from the given min_order. The callback will run in a strictly atomic
and very lightweight context. You can update your bitmap from there.
I need to give more background here:
The hypervisor and the guest live in their own address spaces. The hypervisor's bitmap
isn't seen by the guest. I think we also wouldn't be able to pass a callback function
from the hypervisor to the guest in this case.
How did you plan to use your original API, which exports a struct page array,
then?

That's where the virtio-balloon driver comes in. It uses a shared ring
mechanism to send the guest memory info to the hypervisor.

We didn't expose the struct page array from the guest to the hypervisor. For
example, when a 2MB free page block is reported from the free page list, the
info put on the ring is just
(base address of the 2MB contiguous memory, size=2MB).
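
As a rough illustration of the (base address, size) pair described above, the
sketch below converts a free block's starting pfn and order into such an entry.
The struct and function names are hypothetical, and PAGE_SHIFT is assumed to be
12 (4KB pages); this is not the actual virtio-balloon ring format.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4KB pages; the real PAGE_SHIFT is architecture-specific. */
#define PAGE_SHIFT 12

/* Hypothetical ring entry, for illustration only. */
struct free_block_entry {
    uint64_t base_addr; /* guest-physical address of the first page */
    uint64_t size;      /* length of the block in bytes */
};

/* Describe a free block of 2^order pages starting at page frame number pfn. */
static struct free_block_entry make_entry(uint64_t pfn, unsigned int order)
{
    struct free_block_entry e;

    assert(order < 64 - PAGE_SHIFT); /* keeps the shift below well defined */
    e.base_addr = pfn << PAGE_SHIFT;
    e.size = (uint64_t)1 << (order + PAGE_SHIFT);
    return e;
}
```

Under these assumptions, an order-9 block starting at pfn 0x200 gives
base_addr=0x200000 and size=2MB.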
So what exactly prevents virtio-balloon from using the above proposed
callback mechanism and exporting what is needed to the hypervisor?

I thought about it more. We can probably use the callback function with a small change, like this:

void walk_free_mem(void *opaque1, void (*visit)(void *opaque2, unsigned long pfn,
                   unsigned long nr_pages))
{
        ...
        for_each_populated_zone(zone) {
                for_each_migratetype_order(order, type) {
                        report_unused_page_block(zone, order, type, &page); // from patch 6
                        pfn = page_to_pfn(page);
                        visit(opaque1, pfn, 1 << order);
                }
        }
}

The above function scans all the free lists and directly sends each free page block to the
hypervisor via the virtio_balloon callback below. There is no need to implement a bitmap.

In virtio-balloon, we have the callback:
void virtio_balloon_report_unused_pages(void *opaque, unsigned long pfn,
                                        unsigned long nr_pages)
{
        struct virtio_balloon *vb = (struct virtio_balloon *)opaque;
        ... put the free page block on the ring of vb;
}
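
To make the opaque-pointer flow concrete, here is a minimal userspace sketch of
the same pattern. Everything in it is a stand-in: mock_free_block, mock_ring,
mock_walk_free_mem and mock_report_unused_pages are hypothetical names, the free
lists are replaced by a plain array, and the ring is a fixed-size buffer. It only
shows how the walker hands the caller's pointer through to the callback, not how
the kernel code would look.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one free page block found on a free list. */
struct mock_free_block {
    unsigned long pfn;
    unsigned int order;
};

/* Hypothetical stand-in for the balloon ring: a small fixed-size buffer. */
#define MOCK_RING_SIZE 8
struct mock_ring {
    unsigned long pfn[MOCK_RING_SIZE];
    unsigned long nr_pages[MOCK_RING_SIZE];
    size_t used;    /* entries currently stored */
    size_t dropped; /* blocks skipped because the ring was full */
};

typedef void (*visit_fn)(void *opaque, unsigned long pfn,
                         unsigned long nr_pages);

/* Walk the mock blocks and pass opaque through to the callback untouched. */
static void mock_walk_free_mem(const struct mock_free_block *blocks, size_t n,
                               void *opaque, visit_fn visit)
{
    assert(visit != NULL);
    for (size_t i = 0; i < n; i++)
        visit(opaque, blocks[i].pfn, 1UL << blocks[i].order);
}

/* Callback: recover the ring from opaque and record the block on it. */
static void mock_report_unused_pages(void *opaque, unsigned long pfn,
                                     unsigned long nr_pages)
{
    struct mock_ring *ring = opaque;

    if (ring->used == MOCK_RING_SIZE) {
        ring->dropped++; /* a real driver would need a policy here */
        return;
    }
    ring->pfn[ring->used] = pfn;
    ring->nr_pages[ring->used] = nr_pages;
    ring->used++;
}
```

The walker never needs to know what opaque points to, which is what lets the
page allocator side stay independent of virtio-balloon in this sketch.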


What do you think?


Best,
Wei