Re: [PATCH] virtio_balloon: fix shrinker pages_to_free calculation

From: Michael S. Tsirkin
Date: Mon Nov 18 2019 - 00:30:39 EST


On Mon, Nov 18, 2019 at 12:01:08PM +0800, Wei Wang wrote:
> On 11/16/2019 06:55 AM, Khazhismel Kumykov wrote:
> > To my reading, we're accumulating total freed pages in pages_freed, but
> > subtracting it every iteration from pages_to_free, meaning we'll count
> > earlier iterations multiple times, freeing fewer pages than expected.
> > Just accumulate in pages_freed, and compare to pages_to_free.
>
> Not sure about the above. But the following unit mismatch is a good capture,
> thanks!
>
> >
> > There's also a unit mismatch, where pages_to_free seems to be virtio
> > balloon pages, and pages_freed is system pages (We divide by
> > VIRTIO_BALLOON_PAGES_PER_PAGE), so sutracting pages_freed from
> > pages_to_free may result in freeing too much.
> >
> > There also seems to be a mismatch between shrink_free_pages() and
> > shrink_balloon_pages(), where in both pages_to_free is given as # of
> > virtio pages to free, but free_pages() returns virtio pages, and
> > balloon_pages returns system pages.
> >
> > (For 4K PAGE_SIZE, this mismatch wouldn't be noticed since
> > VIRTIO_BALLOON_PAGES_PER_PAGE would be 1)
> >
> > Have both return virtio pages, and divide into system pages when
> > returning from shrinker_scan()
>
> Sounds good.
>
> >
> > Fixes: 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> > Cc: Wei Wang <wei.w.wang@xxxxxxxxx>
> > Signed-off-by: Khazhismel Kumykov <khazhy@xxxxxxxxxx>
> > ---
> >
> > Tested this under memory pressure conditions and the shrinker seemed to
> > shrink.
> >
> > drivers/virtio/virtio_balloon.c | 11 ++++-------
> > 1 file changed, 4 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> > index 226fbb995fb0..7951ece3fe24 100644
> > --- a/drivers/virtio/virtio_balloon.c
> > +++ b/drivers/virtio/virtio_balloon.c
> > @@ -782,11 +782,8 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
> > * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> > * multiple times to deflate pages till reaching pages_to_free.
> > */
> > - while (vb->num_pages && pages_to_free) {
> > - pages_freed += leak_balloon(vb, pages_to_free) /
> > - VIRTIO_BALLOON_PAGES_PER_PAGE;
> > - pages_to_free -= pages_freed;
> > - }
> > + while (vb->num_pages && pages_to_free > pages_freed)
> > + pages_freed += leak_balloon(vb, pages_to_free - pages_freed);
> > update_balloon_size(vb);
> > return pages_freed;
> > @@ -805,11 +802,11 @@ static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
> > pages_freed = shrink_free_pages(vb, pages_to_free);
>
> We also need a fix here then:
>
> pages_freed = shrink_free_pages(vb, sc->nr_to_scan) *
> VIRTIO_BALLOON_PAGES_PER_PAGE;

No let's do accounting in pages please. virtio page is a legacy
thing we just did not fix it in time to get rid of it by now.

>
> Btw, there is another mistake, in virtio_balloon_shrinker_count:
>
> - count += vb->num_free_page_blocks >> VIRTIO_BALLOON_FREE_PAGE_ORDER;
> + count += vb->num_free_page_blocks << VIRTIO_BALLOON_FREE_PAGE_ORDER;
>
> You may want to include it in this fix patch as well.

OMG. should be a separate patch.
But really this just shows why shifts are such a bad idea.

Let's define
VIRTIO_BALLOON_PAGES_PER_FREE_PAGE

and use it with * and / consistently instead of shifts.


> Best,
> Wei
>