Re: [virtio-dev] Re: [PATCH v25 2/2] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

From: Wei Wang
Date: Thu Feb 01 2018 - 04:41:01 EST


On 01/31/2018 07:44 AM, Michael S. Tsirkin wrote:
On Fri, Jan 26, 2018 at 11:31:19AM +0800, Wei Wang wrote:
On 01/26/2018 10:42 AM, Michael S. Tsirkin wrote:
On Fri, Jan 26, 2018 at 09:40:44AM +0800, Wei Wang wrote:
On 01/25/2018 09:49 PM, Michael S. Tsirkin wrote:
On Thu, Jan 25, 2018 at 05:14:06PM +0800, Wei Wang wrote:

The controversy is that the free list is not static
once the lock is dropped, so everything is dynamically changing, including
the state that was recorded. The method we are using is more prudent, IMHO.
How about taking the fundamental solution now, and seeking to improve it
incrementally in the future?


Best,
Wei
I'd like to see kicks happen outside the spinlock. A kick with a spinlock
taken looks like a scalability issue that won't be easy to
reproduce but will hurt workloads at random, unexpected times.
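
To illustrate the pattern being asked for here, the following is a minimal userspace sketch, not code from the patch. It assumes a pthread mutex as a stand-in for the spinlock and a plain array as a stand-in for the virtqueue; `struct hint_queue`, `hint_queue_add_batch()` and `hint_queue_kick()` are hypothetical names. Entries are queued while the lock is held, and the single notification is issued only after the lock has been dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8

/* Hypothetical stand-in for a virtqueue: a small ring of page frame numbers. */
struct hint_queue {
    pthread_mutex_t lock;          /* stands in for the spinlock */
    unsigned long ring[RING_SIZE]; /* queued hints */
    size_t used;                   /* number of entries queued so far */
    unsigned int kicks;            /* number of notifications sent */
};

static void hint_queue_init(struct hint_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->used = 0;
    q->kicks = 0;
}

/* Notify the consumer. Called only after the lock has been released. */
static void hint_queue_kick(struct hint_queue *q)
{
    q->kicks++;
}

/*
 * Queue up to n hints under the lock, then kick once with the lock dropped.
 * Returns the number of hints actually queued.
 */
static size_t hint_queue_add_batch(struct hint_queue *q,
                                   const unsigned long *pfns, size_t n)
{
    size_t added = 0;
    bool need_kick;

    assert(q != NULL && pfns != NULL);

    pthread_mutex_lock(&q->lock);
    while (added < n && q->used < RING_SIZE)
        q->ring[q->used++] = pfns[added++];
    need_kick = added > 0;
    pthread_mutex_unlock(&q->lock);

    /* The potentially slow notification happens outside the critical section. */
    if (need_kick)
        hint_queue_kick(q);
    return added;
}
```

The design point is only that the time spent in the notification no longer extends the time the lock is held; whether batching like this fits the actual free-list walk is a separate question.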

Is that "kick inside the spinlock" the only concern you have? I think we can
remove the kick actually. If we check how the host side works, it is
worthwhile to let the host poll the virtqueue after it receives the cmd id
from the guest (kick for cmd id isn't within the lock).


Best,
Wei
So really there are different ways to put free page hints to use.

The current interface requires host to do dirty tracking
for all memory, and it's more or less useless for
things like freeing host memory.

So while your project's needs seem to be addressed, I'm
still a bit disappointed that so little collaboration
happened with e.g. Nitesh's project, to the point where
you don't even CC him on patches.

Isn't "nilal@xxxxxxxxxx" Nitesh? Actually, he has been CC-ed for a long time.

I think we should at least see the performance numbers and a working prototype from them (I remember they lack the host side implementation).

Btw, this feature is requested by many customers of Linux (it is not just our own project's need). They want to use this feature to optimize their *live migration*. I hope the community can understand our need.


So I'm kind of trying to bridge this a bit - I would
like the interfaces that we build to at least superficially
look like they might be reusable for other uses of hinting.

Imagine that you don't have dirty tracking on the host.
What would it take to still use hinting information,
e.g. to call MADV_FREE on the pages guest gives us?

I think you need to kick and you need to wait for
host to consume the hint before page is reused.
And we know madvise takes a lot of time sometimes,
so locking out the free list does not sound like a
good idea.
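
As a rough illustration of the MADV_FREE case described above, here is a minimal sketch of what a host process might do with one hinted range. It assumes the range is page-aligned, private anonymous memory; `release_hinted_range()` is a hypothetical helper, and the fallback to MADV_DONTNEED is an assumption added for kernels or headers without MADV_FREE, not something proposed in this thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Tell the kernel that [addr, addr + len) no longer holds useful data.
 * addr must be page-aligned. Returns 0 on success, -1 on failure.
 */
static int release_hinted_range(void *addr, size_t len)
{
    /* madvise() rejects unaligned start addresses, so check up front. */
    assert((uintptr_t)addr % (uintptr_t)sysconf(_SC_PAGESIZE) == 0);

#ifdef MADV_FREE
    /* Lazy free: the pages are reclaimed only under memory pressure. */
    if (madvise(addr, len, MADV_FREE) == 0)
        return 0;
    if (errno != EINVAL)
        return -1;
#endif
    /* Assumed fallback: drop the pages immediately. */
    return madvise(addr, len, MADV_DONTNEED);
}
```

The call can be slow for large ranges, which is the reason given above for not holding the free list locked while the host consumes a hint.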

That's why I was talking about kick out of lock,
so that eventually we can reuse that for hinting
and actually wait for an interrupt.

So how about we take a bunch of pages out of the free list, move them to
the balloon, kick (and optionally wait for host to consume), then move
them back? Preferably to the end of the list? This will also make things
like sorting them much easier as you can just put them in a binary tree
or something.
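
The take-out-and-return idea above can be sketched in plain C as follows. This is a toy model, not kernel code: `struct page_list` is a hypothetical circular list loosely in the spirit of the kernel's `list_head`, locking is omitted, and the kick or wait would sit between the two moves.

```c
#include <assert.h>
#include <stddef.h>

struct page_node {
    unsigned long pfn;
    struct page_node *prev, *next;
};

/* Circular doubly linked list with a sentinel head node. */
struct page_list {
    struct page_node head;
    size_t count;
};

static void page_list_init(struct page_list *l)
{
    l->head.prev = l->head.next = &l->head;
    l->count = 0;
}

static void page_list_add_tail(struct page_list *l, struct page_node *n)
{
    assert(n != NULL);
    n->prev = l->head.prev;
    n->next = &l->head;
    l->head.prev->next = n;
    l->head.prev = n;
    l->count++;
}

/* Remove and return the first node, or NULL if the list is empty. */
static struct page_node *page_list_pop_front(struct page_list *l)
{
    struct page_node *n = l->head.next;

    if (n == &l->head)
        return NULL;
    l->head.next = n->next;
    n->next->prev = &l->head;
    n->prev = n->next = NULL;
    l->count--;
    return n;
}

/* Move up to max nodes from the front of src to the tail of dst. */
static size_t page_list_move(struct page_list *src, struct page_list *dst,
                             size_t max)
{
    size_t moved = 0;
    struct page_node *n;

    while (moved < max && (n = page_list_pop_front(src)) != NULL) {
        page_list_add_tail(dst, n);
        moved++;
    }
    return moved;
}

/* Step 1: take a batch of pages out of the free list into the balloon. */
static size_t balloon_take(struct page_list *free_list,
                           struct page_list *balloon, size_t max)
{
    return page_list_move(free_list, balloon, max);
}

/* Step 3 (after the kick): return all ballooned pages to the list tail. */
static size_t balloon_give_back(struct page_list *balloon,
                                struct page_list *free_list)
{
    return page_list_move(balloon, free_list, balloon->count);
}
```

Because returned pages land at the tail, pages that were just hinted are the last ones to be handed out again, which is the ordering preference expressed above.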

For when we need to be careful to make sure we don't
create an OOM situation with this out of thin air,
and for when you can't give everything to host in one go,
you might want some kind of notifier that tells you
that you need to return pages to the free list ASAP.

How'd this sound?


I think the above duplicates the function of ballooning, though there are some differences. Please see my concerns and different thoughts below:

1) From the previous discussion, the only acceptable method to get pages from mm is to do alloc() (btw, we are not getting pages in this patch, we are getting hints). The above sounds like we are going to take pages from the free list without mm's awareness. I'm not sure if you would be ready to convince the mm folks that this idea is allowed.

2) If the guest has 8G of free memory, how much can virtio-balloon take with the above method? For example, suppose virtio-balloon takes only 1G, leaving 7G in mm. The next moment, something may come along that needs 7.5G. I think it is hardly possible to ensure that the amount of memory we take into virtio-balloon won't affect the system.

3) Hints mean the pages are quite likely to be free pages (no guarantee). If the pages given to host are going to be freed, then we really couldn't call them hints; they are true free pages. Ballooning needs true free pages, while live migration needs hints. Would you agree with this? From the perspective of features, they are two different features, and should be gated with two feature bits and separate implementations. Mixing them would cause many unexpected issues (e.g. the case when the two features function at the same time).

4) If we want to add another function of ballooning, how is this better than the existing ballooning? The difference I can see is the current ballooning takes free pages via alloc(), while the above hacks into the free page list.


Best,
Wei