RE: [virtio-dev] Re: [PATCH v2 repost 4/7] virtio-balloon: speed up inflate/deflate process
From: Li, Liang Z
Date: Thu Jul 28 2016 - 21:08:22 EST
> > > On Wed, Jul 27, 2016 at 09:03:21AM -0700, Dave Hansen wrote:
> > > > On 07/26/2016 06:23 PM, Liang Li wrote:
> > > > > + vb->pfn_limit = VIRTIO_BALLOON_PFNS_LIMIT;
> > > > > + vb->pfn_limit = min(vb->pfn_limit, get_max_pfn());
> > > > > + vb->bmap_len = ALIGN(vb->pfn_limit, BITS_PER_LONG) /
> > > > > + BITS_PER_BYTE + 2 * sizeof(unsigned long);
> > > > > + hdr_len = sizeof(struct balloon_bmap_hdr);
> > > > > + vb->bmap_hdr = kzalloc(hdr_len + vb->bmap_len, GFP_KERNEL);
> > > >
> > > > This ends up doing a 1MB kmalloc() right? That seems a _bit_ big.
> > > > How big was the pfn buffer before?
> > >
> > >
> > > Yes, I would limit this to 1G of memory in one go, which results in a 32KByte bitmap.
> > >
> > > --
> > > MST
> >
> > Limiting it to 1G is bad for performance; I sent you the test results several weeks ago.
> >
> > Pasted below:
> > ----------------------------------------------------------------------
> > About the size of the page bitmap: I have tested the performance of
> > filling the balloon to 15GB in a VM with 16GB of RAM.
> >
> > ===============================
> > 32K Byte bitmap (covers 1GB of RAM)
> >
> > Time spent inflating: 2031ms
> > -------------------------------
> > 64K Byte bitmap (covers 2GB of RAM)
> >
> > Time spent inflating: 1507ms
> > -------------------------------
> > 512K Byte bitmap (covers 16GB of RAM)
> >
> > Time spent inflating: 1237ms
> > ===============================
> >
> > If possible, a big bitmap is better for performance.
> >
> > Liang
>
> Earlier you said:
> a. allocating pages (6.5%)
> b. sending PFNs to host (68.3%)
> c. address translation (6.1%)
> d. madvise (19%)
>
> Here sending PFNs to host with 512K Byte map should be almost free.
>
> So is something else taking up the time?
>
I just wanted to show you the benefit of using a big bitmap. :)
I did not measure the time spent in each stage after the optimization (I will do that later),
but I tried allocating pages in big chunks and found that it makes things faster.
Without big-chunk allocation, the performance improvement is about 85%; with
big-chunk allocation, the improvement is about 94%.
Liang
>
> --
> MST
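
For reference, the bitmap sizes quoted in this thread can be checked with a small calculation. The following is only a sketch, not code from the patch: it assumes 4 KiB guest pages and one bit per page, and the ex_ names are hypothetical helpers introduced for illustration. The batch count further assumes that each full bitmap is handed to the host as one batch, which the thread does not spell out.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SHIFT    12  /* assumption: 4 KiB guest pages */
#define EX_BITS_PER_BYTE 8

/* Bytes of bitmap needed to track ram_bytes of guest RAM, one bit per page. */
static uint64_t ex_bitmap_bytes(uint64_t ram_bytes)
{
	uint64_t pages = ram_bytes >> EX_PAGE_SHIFT;

	/* Round up so that a partial byte of bits still gets storage. */
	return (pages + EX_BITS_PER_BYTE - 1) / EX_BITS_PER_BYTE;
}

/*
 * Number of bitmap-sized batches needed to cover target_bytes of balloon,
 * assuming one batch per full bitmap (an assumption, see above).
 */
static uint64_t ex_batches(uint64_t target_bytes, uint64_t bitmap_bytes)
{
	uint64_t covered;

	assert(bitmap_bytes > 0);
	/* RAM covered by one bitmap: one page per bit. */
	covered = (bitmap_bytes * EX_BITS_PER_BYTE) << EX_PAGE_SHIFT;
	return (target_bytes + covered - 1) / covered;
}
```

Under these assumptions, ex_bitmap_bytes() gives 32K, 64K and 512K bytes for 1GB, 2GB and 16GB of RAM, matching the figures in the test report, and inflating to 15GB takes 15, 8 and 1 batches respectively.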