Re: [PATCH 3/5] vhost: support upto 509 memory regions
From: Michael S. Tsirkin
Date: Wed Jun 17 2015 - 06:11:27 EST
On Wed, Jun 17, 2015 at 10:54:21AM +0200, Igor Mammedov wrote:
> On Wed, 17 Jun 2015 09:39:06 +0200
> "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
>
> > On Wed, Jun 17, 2015 at 09:28:02AM +0200, Igor Mammedov wrote:
> > > On Wed, 17 Jun 2015 08:34:26 +0200
> > > "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
> > >
> > > > On Wed, Jun 17, 2015 at 12:00:56AM +0200, Igor Mammedov wrote:
> > > > > On Tue, 16 Jun 2015 23:14:20 +0200
> > > > > "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
> > > > >
> > > > > > On Tue, Jun 16, 2015 at 06:33:37PM +0200, Igor Mammedov wrote:
> > > > > > > since commit
> > > > > > > 1d4e7e3 kvm: x86: increase user memory slots to 509
> > > > > > >
> > > > > > > it became possible to use a bigger amount of memory
> > > > > > > slots, which is used by memory hotplug for
> > > > > > > registering hotplugged memory.
> > > > > > > However QEMU crashes if it's used with more than ~60
> > > > > > > pc-dimm devices and vhost-net since host kernel
> > > > > > > in module vhost-net refuses to accept more than 65
> > > > > > > memory regions.
> > > > > > >
> > > > > > > Increase VHOST_MEMORY_MAX_NREGIONS from 65 to 509
> > > > > >
> > > > > > It was 64, not 65.
> > > > > >
> > > > > > > to match KVM_USER_MEM_SLOTS fixes issue for vhost-net.
> > > > > > >
> > > > > > > Signed-off-by: Igor Mammedov <imammedo@xxxxxxxxxx>
> > > > > >
> > > > > > Still thinking about this: can you reorder this to
> > > > > > be the last patch in the series please?
> > > > > sure
> > > > >
> > > > > >
> > > > > > Also - 509?
> > > > > userspace memory slots in terms of KVM, I made it match
> > > > > KVM's allotment of memory slots for userspace side.
> > > >
> > > > Maybe KVM has its reasons for this #. I don't see
> > > > why we need to match this exactly.
> > > np, I can cap it at safe 300 slots but it's unlikely that it
> > > would take cut off 1 extra hop since it's capped by QEMU
> > > at 256+[initial fragmented memory]
> >
> > But what's the point? We allocate 32 bytes per slot.
> > 300*32 = 9600 which is more than 8K, so we are doing
> > an order-3 allocation anyway.
> > If we could cap it at 8K (256 slots) that would make sense
> > since we could avoid wasting vmalloc space.
> 256 is amount of hotpluggable slots and there is no way
> to predict how initial memory would be fragmented
> (i.e. amount of slots it would take), if we guess wrong
> we are back to square one with crashing userspace.
> So I'd stay consistent with KVM's limit 509 since
> it's only limit, i.e. not actual amount of allocated slots.
>
> > I'm still not very happy with the whole approach,
> > giving userspace ability allocate 4 whole pages
> > of kernel memory like this.
> I'm working in parallel so that userspace won't take so
> many slots but it won't prevent its current versions
> crashing due to kernel limitation.
Right but at least it's not a regression. If we promise userspace to
support a ton of regions, we can't take it back later, and I'm concerned
about the memory usage.
I think it's already safe to merge the binary lookup patches, and maybe
cache and vmalloc, so that the remaining patch will be small.
>
> > > > > > I think if we are changing this, it'd be nice to
> > > > > > create a way for userspace to discover the support
> > > > > > and the # of regions supported.
> > > > > That was my first idea before extending KVM's memslots
> > > > > to teach kernel to tell qemu this number so that QEMU
> > > > > at least would be able to check if new memory slot could
> > > > > be added but I was redirected to a more simple solution
> > > > > of just extending vs everdoing things.
> > > > > Currently QEMU supports upto ~250 memslots so 509
> > > > > is about twice high we need it so it should work for near
> > > > > future
> > > >
> > > > Yes but old kernels are still around. Would be nice if you
> > > > can detect them.
> > > >
> > > > > but eventually we might still teach kernel and QEMU
> > > > > to make things more robust.
> > > >
> > > > A new ioctl would be easy to add, I think it's a good
> > > > idea generally.
> > > I can try to do something like this on top of this series.
> > >
> > > >
> > > > > >
> > > > > >
> > > > > > > ---
> > > > > > > drivers/vhost/vhost.c | 2 +-
> > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > > > > > > index 99931a0..6a18c92 100644
> > > > > > > --- a/drivers/vhost/vhost.c
> > > > > > > +++ b/drivers/vhost/vhost.c
> > > > > > > @@ -30,7 +30,7 @@
> > > > > > > #include "vhost.h"
> > > > > > >
> > > > > > > enum {
> > > > > > > - VHOST_MEMORY_MAX_NREGIONS = 64,
> > > > > > > + VHOST_MEMORY_MAX_NREGIONS = 509,
> > > > > > > VHOST_MEMORY_F_LOG = 0x1,
> > > > > > > };
> > > > > > >
> > > > > > > --
> > > > > > > 1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/