Re: [PATCH] vhost: support upto 509 memory regions

From: Michael S. Tsirkin
Date: Tue Feb 17 2015 - 08:29:43 EST

On Tue, Feb 17, 2015 at 02:11:37PM +0100, Paolo Bonzini wrote:
> On 17/02/2015 13:32, Michael S. Tsirkin wrote:
> > On Tue, Feb 17, 2015 at 11:59:48AM +0100, Paolo Bonzini wrote:
> >>
> >>
> >> On 17/02/2015 10:02, Michael S. Tsirkin wrote:
> >>>> Increasing VHOST_MEMORY_MAX_NREGIONS from 65 to 509
> >>>> to match KVM_USER_MEM_SLOTS fixes the issue for vhost-net.
> >>>>
> >>>> Signed-off-by: Igor Mammedov <imammedo@xxxxxxxxxx>
> >>>
> >>> This scares me a bit: each region is 32 bytes, so we are talking
> >>> about a 16K allocation that userspace can trigger.
> >>
> >> What's bad with a 16K allocation?
> >
> > It fails when memory is fragmented.
>
> If memory is _that_ fragmented I think you have much bigger problems
> than vhost.
>
> > I'm guessing kvm doesn't do memory scans on data path, vhost does.
>
> It does for MMIO memory-to-memory writes, but that's not a particularly
> fast path.
>
> KVM doesn't access the memory map on fast paths, but QEMU does, so I
> don't think it's beyond the expectations of the kernel.

QEMU has an elaborate data structure to deal with that.

> For example you
> can use a radix tree (not lib/radix-tree.c unfortunately), and cache
> GVA->HPA translations if it turns out that lookup has become a hot path.

All vhost lookups are hot path.

> The addressing space of x86 is in practice 44 bits or fewer, and each
> slot will typically be at least 1 GiB, so you only have 14 bits to
> dispatch on. It's probably possible to only have two or three levels
> in the radix tree in the common case, and beat the linear scan real quick.

Not if there are about 6 regions, I think.
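
To make the comparison concrete, here is a minimal sketch of the kind of
linear scan under discussion. It is not the actual vhost code: struct
mem_region is a hypothetical stand-in that mirrors the 32-byte region
layout mentioned earlier in the thread, and find_region() and translate()
are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-byte memory region descriptor. */
struct mem_region {
	uint64_t guest_phys_addr;  /* start of the region in guest address space */
	uint64_t memory_size;      /* length of the region in bytes */
	uint64_t userspace_addr;   /* where the region is mapped in userspace */
	uint64_t flags_padding;    /* unused here, keeps the struct at 32 bytes */
};

/* Linear scan: return the region containing addr, or NULL if none does. */
static const struct mem_region *
find_region(const struct mem_region *regions, size_t n, uint64_t addr)
{
	assert(regions != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct mem_region *r = &regions[i];

		/* Compare via subtraction so base + size cannot overflow. */
		if (addr >= r->guest_phys_addr &&
		    addr - r->guest_phys_addr < r->memory_size)
			return r;
	}
	return NULL;
}

/* Translate a guest address to a userspace address; returns 0 on a miss. */
static uint64_t
translate(const struct mem_region *regions, size_t n, uint64_t addr)
{
	const struct mem_region *r = find_region(regions, n, addr);

	return r ? r->userspace_addr + (addr - r->guest_phys_addr) : 0;
}
```

With about 6 regions the whole table is 192 bytes, so the loop touches at
most a few cache lines and does no pointer chasing. A multi-level tree
walk has to follow one dependent pointer per level.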

> The radix tree can be tuned to use order-0 allocations, and then your
> worries about fragmentation go away too.
>
> Paolo

Increasing the number might be reasonable for workloads such as nested
virt. But depending on this in userspace when you don't have to is not a
good idea IMHO.
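
For reference, the allocation size raised at the top of the thread can be
worked out as below. This is a rough sketch under stated assumptions:
4 KiB pages, 32 bytes per region as quoted above, and an 8-byte table
header (two 32-bit fields). The helper names table_bytes() and
alloc_order() are made up for illustration.

```c
#include <stddef.h>

#define REGION_BYTES 32u   /* per-region size quoted in the thread */
#define HEADER_BYTES 8u    /* assumption: two 32-bit header fields */
#define PAGE_BYTES   4096u /* assumption: 4 KiB pages */

/* Total bytes needed for a table of nregions entries plus the header. */
static size_t table_bytes(size_t nregions)
{
	return HEADER_BYTES + nregions * REGION_BYTES;
}

/* Smallest order such that (1 << order) pages hold the given byte count. */
static unsigned int alloc_order(size_t bytes)
{
	size_t pages = (bytes + PAGE_BYTES - 1) / PAGE_BYTES;
	unsigned int order = 0;

	while (((size_t)1 << order) < pages)
		order++;
	return order;
}
```

Under these assumptions, 65 regions need 2088 bytes and fit in a single
page (order 0), while 509 regions need 16296 bytes, i.e. four physically
contiguous pages (order 2). That is the allocation that can fail when
memory is fragmented.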
