Re: [PATCH] virtio_balloon: fix shrinker pages_to_free calculation

From: Wei Wang
Date: Mon Nov 18 2019 - 02:35:27 EST


On 11/18/2019 01:30 PM, Michael S. Tsirkin wrote:
On Mon, Nov 18, 2019 at 12:01:08PM +0800, Wei Wang wrote:
On 11/16/2019 06:55 AM, Khazhismel Kumykov wrote:
To my reading, we're accumulating total freed pages in pages_freed, but
subtracting it every iteration from pages_to_free, meaning we'll count
earlier iterations multiple times, freeing fewer pages than expected.
Just accumulate in pages_freed, and compare to pages_to_free.
Not sure about the above. But the following unit mismatch is a good catch,
thanks!
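
To make the accumulation question concrete, here is a minimal toy model of the two loop shapes, not the driver code. It assumes a hypothetical per-call cap (CHUNK_MAX, standing in for VIRTIO_BALLOON_ARRAY_PFNS_MAX), a hypothetical toy_leak() helper in place of leak_balloon(), and it ignores the unit conversion entirely.

```c
#include <assert.h>

/* Hypothetical per-call cap, standing in for VIRTIO_BALLOON_ARRAY_PFNS_MAX. */
#define CHUNK_MAX 256UL

/* Toy stand-in for leak_balloon(): frees at most CHUNK_MAX pages per call. */
static unsigned long toy_leak(unsigned long *balloon, unsigned long want)
{
	unsigned long n = want;

	assert(balloon);
	if (n > CHUNK_MAX)
		n = CHUNK_MAX;
	if (n > *balloon)
		n = *balloon;
	*balloon -= n;
	return n;
}

/* Old loop shape: subtracts the running total on every iteration. */
static unsigned long shrink_old(unsigned long balloon,
				unsigned long pages_to_free)
{
	unsigned long pages_freed = 0;

	while (balloon && pages_to_free) {
		pages_freed += toy_leak(&balloon, pages_to_free);
		pages_to_free -= pages_freed;
	}
	return pages_freed;
}

/* Patched loop shape: keep the target fixed, compare against the total. */
static unsigned long shrink_new(unsigned long balloon,
				unsigned long pages_to_free)
{
	unsigned long pages_freed = 0;

	while (balloon && pages_to_free > pages_freed)
		pages_freed += toy_leak(&balloon, pages_to_free - pages_freed);
	return pages_freed;
}
```

In this model, a request for 768 pages against a balloon of 4096 stops after 512 pages with the old loop shape and reaches 768 with the patched one. For other request sizes the unsigned subtraction in the old shape can wrap instead of reaching zero, so the toy model is only meant to show that the running total is subtracted more than once.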

There's also a unit mismatch, where pages_to_free seems to be virtio
balloon pages, and pages_freed is system pages (We divide by
VIRTIO_BALLOON_PAGES_PER_PAGE), so subtracting pages_freed from
pages_to_free may result in freeing too much.

There also seems to be a mismatch between shrink_free_pages() and
shrink_balloon_pages(), where in both pages_to_free is given as # of
virtio pages to free, but free_pages() returns virtio pages, and
balloon_pages returns system pages.

(For 4K PAGE_SIZE, this mismatch wouldn't be noticed since
VIRTIO_BALLOON_PAGES_PER_PAGE would be 1)
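
A rough sketch of the two units, assuming virtio balloon pages are 4 KiB (a shift of 12) and treating the system page size as a parameter. The helper names here are hypothetical and only illustrate the arithmetic; they are not taken from the driver.

```c
#include <assert.h>

/* Assumption: a virtio balloon page is 4 KiB, i.e. a shift of 12. */
#define TOY_BALLOON_PFN_SHIFT 12

/* Number of balloon pages that make up one system page. */
static unsigned long balloon_pages_per_page(unsigned long page_size)
{
	assert(page_size >= (1UL << TOY_BALLOON_PFN_SHIFT));
	return page_size >> TOY_BALLOON_PFN_SHIFT;
}

/* Convert a count of balloon pages into a count of system pages. */
static unsigned long balloon_to_system_pages(unsigned long balloon_pages,
					     unsigned long page_size)
{
	return balloon_pages / balloon_pages_per_page(page_size);
}
```

With a 4 KiB system page the ratio is 1 and both counts agree, which is why the mismatch goes unnoticed there. With a 64 KiB system page the ratio is 16, so mixing the two counts in one subtraction is off by that factor.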

Have both return virtio pages, and divide into system pages when
returning from shrinker_scan()
Sounds good.

Fixes: 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
Cc: Wei Wang <wei.w.wang@xxxxxxxxx>
Signed-off-by: Khazhismel Kumykov <khazhy@xxxxxxxxxx>
---

Tested this under memory pressure conditions and the shrinker seemed to
shrink.

drivers/virtio/virtio_balloon.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 226fbb995fb0..7951ece3fe24 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -782,11 +782,8 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
* VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
* multiple times to deflate pages till reaching pages_to_free.
*/
- while (vb->num_pages && pages_to_free) {
- pages_freed += leak_balloon(vb, pages_to_free) /
- VIRTIO_BALLOON_PAGES_PER_PAGE;
- pages_to_free -= pages_freed;
- }
+ while (vb->num_pages && pages_to_free > pages_freed)
+ pages_freed += leak_balloon(vb, pages_to_free - pages_freed);
update_balloon_size(vb);
return pages_freed;
@@ -805,11 +802,11 @@ static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
pages_freed = shrink_free_pages(vb, pages_to_free);
We also need a fix here then:

pages_freed = shrink_free_pages(vb, sc->nr_to_scan) *
VIRTIO_BALLOON_PAGES_PER_PAGE;
No, let's do the accounting in pages, please. The virtio page is a legacy
thing; we just did not fix it in time to get rid of it by now.

Btw, there is another mistake, in virtio_balloon_shrinker_count:

- count += vb->num_free_page_blocks >> VIRTIO_BALLOON_FREE_PAGE_ORDER;
+ count += vb->num_free_page_blocks << VIRTIO_BALLOON_FREE_PAGE_ORDER;

You may want to include it in this fix patch as well.
OMG. That should be a separate patch.
But really this just shows why shifts are such a bad idea.

Let's define
VIRTIO_BALLOON_PAGES_PER_FREE_PAGE

and use it with * and / consistently instead of shifts.
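
As a sketch of what such a macro could look like, assuming a made-up block order of 10 (the real VIRTIO_BALLOON_FREE_PAGE_ORDER value is not given in this thread) and hypothetical TOY_* names and helpers:

```c
#include <assert.h>
#include <limits.h>

/* Assumption for illustration only: each free page block is order 10. */
#define TOY_FREE_PAGE_ORDER 10UL
#define TOY_FREE_PAGES_PER_BLOCK (1UL << TOY_FREE_PAGE_ORDER)

/* Blocks to pages: multiply, so the direction is hard to get wrong. */
static unsigned long blocks_to_pages(unsigned long blocks)
{
	assert(blocks <= ULONG_MAX / TOY_FREE_PAGES_PER_BLOCK);
	return blocks * TOY_FREE_PAGES_PER_BLOCK;
}

/* Pages to blocks: divide (rounds down). */
static unsigned long pages_to_blocks(unsigned long pages)
{
	return pages / TOY_FREE_PAGES_PER_BLOCK;
}

/* The mistaken shape: shifting right where a left shift was meant. */
static unsigned long blocks_to_pages_buggy(unsigned long blocks)
{
	return blocks >> TOY_FREE_PAGE_ORDER;
}
```

With these toy values, 3 blocks are 3072 pages, while the right shift reports 0, so the count would claim there is nothing to free.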


OK, will do (maybe call it VIRTIO_BALLOON_FREE_PAGES_PER_BLOCK).

Best,
Wei