Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration

From: Michael S. Tsirkin
Date: Thu Dec 15 2016 - 10:55:11 EST


On Thu, Dec 15, 2016 at 07:34:33AM -0800, Dave Hansen wrote:
> On 12/14/2016 12:59 AM, Li, Liang Z wrote:
> >> Subject: Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for
> >> fast (de)inflating & fast live migration
> >>
> >> On 12/08/2016 08:45 PM, Li, Liang Z wrote:
> >>> What's the conclusion of your discussion? It seems you want some
> >>> statistics before deciding whether to rip the bitmap out of the ABI,
> >>> am I right?
> >>
> >> I think Andrea and David feel pretty strongly that we should remove the
> >> bitmap, unless we have some data to support keeping it. I don't feel as
> >> strongly about it, but I think their critique of it is pretty valid. I think the
> >> consensus is that the bitmap needs to go.
> >>
> >> The only real question IMNHO is whether we should do a power-of-2 or a
> >> length. But, if we have 12 bits, then the argument for doing length is pretty
> >> strong. We don't need anywhere near 12 bits if doing power-of-2.
> >
> > I just found that MAX_ORDER would have to be limited to 12 if we use length
> > instead of order. If MAX_ORDER is configured to a value bigger than 12,
> > handling that case makes things more complex.
> >
> > If we use order, we need to break a large memory range whose length is not
> > a power of 2 into several smaller ranges, which also makes the code complex.
>
> I can't imagine it makes the code that much more complex. It adds a for
> loop. Right?
>
> > It seems we leave too many bits for the pfn, and the bits left for length
> > are not enough. How about keeping 45 bits for the pfn and 19 bits for length?
> > 45 bits of pfn can cover a 57-bit physical address, which should be enough
> > for the near future.
> >
> > What's your opinion?
>
> I still think 'order' makes a lot of sense. But, as you say, 57 bits is
> enough for x86 for a while. Other architectures.... who knows?

I think you can probably assume page size >= 4K. But I would not want
to make any other assumptions. E.g. there are systems that absolutely
require you to set high bits for DMA.

I think we really want both length and order.

I understand how you are trying to pack them as tightly as possible.

However, I thought of a trick: we don't need to encode all
possible orders. For example, with 2 bits of order,
we can make them mean:
00 - 4K pages
01 - 2M pages
10 - 1G pages

The guest can program the sizes for each order through config space.

We will have 10 bits left for length.
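
A minimal sketch of the packing described above, just to make the bit
budget concrete. Everything here is an assumption for illustration and not
taken from the patch set: the 64-bit entry, the field order (52 bits of pfn,
then 2 bits of order index, then 10 bits of length), the pfn being counted
in 4K units, the length being counted in pages of the selected size, and
all of the names. The order_shift table stands in for the sizes that the
guest would program through config space.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: [63..12] pfn | [11..10] order index | [9..0] length */
#define LEN_BITS    10
#define ORDER_BITS  2
#define PFN_SHIFT   (LEN_BITS + ORDER_BITS)
#define LEN_MASK    ((1ULL << LEN_BITS) - 1)
#define ORDER_MASK  ((1ULL << ORDER_BITS) - 1)
#define NUM_ORDERS  3

/* Page shift for each order index: 4K, 2M, 1G. In the proposal these
 * would be programmed by the guest; here they are fixed for brevity. */
static const unsigned order_shift[NUM_ORDERS] = { 12, 21, 30 };

/* Pack a range: 'len' pages of the size selected by 'order_idx',
 * starting at 'pfn' (assumed to be in 4K units). */
static uint64_t range_encode(uint64_t pfn, unsigned order_idx, unsigned len)
{
    assert(pfn < (1ULL << (64 - PFN_SHIFT)));
    assert(order_idx < NUM_ORDERS);
    assert(len <= LEN_MASK);
    return (pfn << PFN_SHIFT) | ((uint64_t)order_idx << LEN_BITS) | len;
}

static uint64_t range_pfn(uint64_t e)   { return e >> PFN_SHIFT; }
static unsigned range_order(uint64_t e) { return (unsigned)((e >> LEN_BITS) & ORDER_MASK); }
static unsigned range_len(uint64_t e)   { return (unsigned)(e & LEN_MASK); }

/* Bytes covered by an entry; returns 0 for the unused order index 3. */
static uint64_t range_bytes(uint64_t e)
{
    unsigned idx = range_order(e);

    if (idx >= NUM_ORDERS)
        return 0;
    return (uint64_t)range_len(e) << order_shift[idx];
}
```

With this layout, range_encode(0x12400, 1, 3) would describe three 2M pages
starting at pfn 0x12400, i.e. 6M in total, and the fourth order index (11)
stays free for a future size.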

It might make sense to also allow the guest to program the number of bits
used for order. This will make it easy to extend without
host changes.

--
MST