Re: [PATCH v16 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG

From: Tetsuo Handa
Date: Tue Oct 10 2017 - 07:08:46 EST


Wei Wang wrote:
> On 10/09/2017 11:20 PM, Michael S. Tsirkin wrote:
> > On Sat, Sep 30, 2017 at 12:05:52PM +0800, Wei Wang wrote:
> >> +static inline void xb_set_page(struct virtio_balloon *vb,
> >> + struct page *page,
> >> + unsigned long *pfn_min,
> >> + unsigned long *pfn_max)
> >> +{
> >> + unsigned long pfn = page_to_pfn(page);
> >> +
> >> + *pfn_min = min(pfn, *pfn_min);
> >> + *pfn_max = max(pfn, *pfn_max);
> >> + xb_preload(GFP_KERNEL);
> >> + xb_set_bit(&vb->page_xb, pfn);
> >> + xb_preload_end();
> >> +}
> >> +
> > So, this will allocate memory
> >
> > ...
> >
> >> @@ -198,9 +327,12 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
> >> struct page *page;
> >> struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
> >> LIST_HEAD(pages);
> >> + bool use_sg = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_SG);
> >> + unsigned long pfn_max = 0, pfn_min = ULONG_MAX;
> >>
> >> - /* We can only do one array worth at a time. */
> >> - num = min(num, ARRAY_SIZE(vb->pfns));
> >> + /* Traditionally, we can only do one array worth at a time. */
> >> + if (!use_sg)
> >> + num = min(num, ARRAY_SIZE(vb->pfns));
> >>
> >> mutex_lock(&vb->balloon_lock);
> >> /* We can't release more pages than taken */
> > And is sometimes called on OOM.
> >
> >
> > I suspect we need to
> >
> > 1. keep around some memory for leak on oom
> >
> > 2. for non oom allocate outside locks
> >
> >
>
> I think maybe we can optimize the existing balloon logic, which could
> remove the big balloon lock:
>
> It would not be necessary to have the inflating and deflating run at the
> same time.
> For example, 1st request to inflate 7G RAM, when 1GB has been given to
> the host (so 6G left), the
> 2nd request to deflate 5G is received. Instead of waiting for the 1st
> request to inflate 6G and then
> continuing with the 2nd request to deflate 5G, we can do a diff (6G to
> inflate - 5G to deflate) immediately,
> and got 1G to inflate. In this way, all that driver will do is to simply
> inflate another 1G.
>
> Same for the OOM case: when OOM asks for 1G, while inflating 5G is in
> progress, then the driver can
> deduct 1G from the amount that needs to inflate, and as a result, it
> will inflate 4G.
>
> In this case, we will never have the inflating and deflating task run at
> the same time, so I think it is
> possible to remove the lock, and therefore, we will not have that
> deadlock issue.
>
> What would you guys think?

What is balloon_lock at virtballoon_migratepage() for?

e22504296d4f64fb "virtio_balloon: introduce migration primitives to balloon pages"
f68b992bbb474641 "virtio_balloon: fix race by fill and leak"

And even if we could remove balloon_lock, you still cannot use
__GFP_DIRECT_RECLAIM at xb_set_page(). I think you will need to use
"whether it is safe to wait" flag from
"[PATCH] virtio: avoid possible OOM lockup at virtballoon_oom_notify()" .