Re: [PATCH] vhost: support upto 509 memory regions

From: Andrey Korolyov
Date: Mon May 18 2015 - 12:23:24 EST


On Wed, Feb 18, 2015 at 7:27 AM, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> On Tue, Feb 17, 2015 at 04:53:45PM -0800, Eric Northup wrote:
>> On Tue, Feb 17, 2015 at 4:32 AM, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
>> > On Tue, Feb 17, 2015 at 11:59:48AM +0100, Paolo Bonzini wrote:
>> >>
>> >>
>> >> On 17/02/2015 10:02, Michael S. Tsirkin wrote:
>> >> > > Increasing VHOST_MEMORY_MAX_NREGIONS from 65 to 509
>> >> > > to match KVM_USER_MEM_SLOTS fixes issue for vhost-net.
>> >> > >
>> >> > > Signed-off-by: Igor Mammedov <imammedo@xxxxxxxxxx>
>> >> >
>> >> > This scares me a bit: each region is 32 bytes, so we are talking
>> >> > about a 16K allocation that userspace can trigger.
>> >>
>> >> What's bad with a 16K allocation?
>> >
>> > It fails when memory is fragmented.
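
[For reference, the arithmetic behind that 16K figure, assuming the
vhost_memory_region layout from include/uapi/linux/vhost.h -- an
illustrative sketch, not the kernel code itself:]

#include <stdint.h>
#include <stdio.h>

/* Mirrors struct vhost_memory_region: four __u64 fields, 32 bytes. */
struct region {
        uint64_t guest_phys_addr;
        uint64_t memory_size;
        uint64_t userspace_addr;
        uint64_t flags_padding;
};

int main(void)
{
        /* 509 regions to match KVM_USER_MEM_SLOTS */
        size_t total = 509 * sizeof(struct region);

        printf("%zu bytes\n", total);   /* 16288, i.e. ~16K */
        /* A ~16K kmalloc needs an order-2 (four-page) physically
         * contiguous chunk, which is what can fail once host memory
         * is fragmented. */
        return 0;
}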
>> >
>> >> > How does kvm handle this issue?
>> >>
>> >> It doesn't.
>> >>
>> >> Paolo
>> >
>> > I'm guessing kvm doesn't do memory scans on data path,
>> > vhost does.
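
[For context, that data-path scan amounts to a linear walk over the
region table on every translation, roughly like the following -- a
simplified sketch; the real vhost lookup differs in details:]

#include <stddef.h>
#include <stdint.h>

struct region {
        uint64_t guest_phys_addr;
        uint64_t memory_size;
        uint64_t userspace_addr;
};

/* Translate a guest physical address to a userspace address by scanning
 * the region array; the cost grows with the number of regions, which is
 * why the region limit matters on the hot path. */
static uint64_t gpa_to_hva(const struct region *regs, size_t nregions,
                           uint64_t gpa)
{
        size_t i;

        for (i = 0; i < nregions; i++)
                if (gpa >= regs[i].guest_phys_addr &&
                    gpa - regs[i].guest_phys_addr < regs[i].memory_size)
                        return regs[i].userspace_addr +
                               (gpa - regs[i].guest_phys_addr);
        return 0; /* not mapped */
}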
>> >
>> > qemu is just doing things that the kernel didn't expect it to need.
>> >
>> > Instead, I suggest reducing the number of GPA<->HVA mappings:
>> >
>> > you have GPA 1,5,7
>> > map them at HVA 11,15,17
>> > then you can have 1 slot: 1->11
>> >
>> > To avoid libc reusing the memory holes, reserve them with MAP_NORESERVE
>> > or something like this.
>>
>> This works beautifully when host virtual address bits are more
>> plentiful than guest physical address bits. Not all architectures
>> have that property, though.
>
> AFAIK this is pretty much a requirement for both kvm and vhost,
> as we require each guest page to also be mapped in qemu memory.
>
>> > We can discuss smarter lookup algorithms but I'd rather
>> > userspace didn't do things that we then have to
>> > work around in kernel.
>> >
>> >
>> > --
>> > MST


Hello,

Any chance of getting the proposed patch into mainline? Though it
seems that most users will not suffer from the relatively low slot
number ceiling (they can coarsen slot 'granularity' for larger VMs and
vice versa), a fine slot size, 256M or even 128M, combined with a large
number of slots can be useful for certain kinds of tasks in
orchestration systems. I've made a backport series of all the seemingly
interesting memslot-related improvements onto a 3.10 branch; is it worth
testing it with a straightforward patch like the one above, under
simulated fragmentation of allocations on the host?
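
[For a rough sense of the numbers behind that trade-off, a
back-of-the-envelope sketch using the 65 and 509 limits mentioned
above; the slot sizes are illustrative:]

#include <stdio.h>

#define MiB (1ULL << 20)

int main(void)
{
        unsigned long long slot_sizes[] = { 128 * MiB, 256 * MiB };
        unsigned int limits[] = { 65, 509 };    /* old and proposed limits */
        unsigned int s, l;

        /* Guest RAM addressable when every slot keeps a fixed, fine
         * granularity (ignoring slots consumed by ROMs and devices). */
        for (s = 0; s < 2; s++)
                for (l = 0; l < 2; l++)
                        printf("%4llu MiB slots x %3u regions = %6.1f GiB\n",
                               slot_sizes[s] / MiB, limits[l],
                               (double)(slot_sizes[s] * limits[l]) /
                               (1ULL << 30));
        return 0;
}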