Re: [PATCH] virtio_balloon: use non-blocking allocation
From: Michael S. Tsirkin
Date: Tue Jan 30 2018 - 19:01:38 EST
On Tue, Jan 02, 2018 at 11:50:21PM +0900, Tetsuo Handa wrote:
> Commit c7cdff0e864713a0 ("virtio_balloon: fix deadlock on OOM") tried to
> avoid OOM lockup by moving memory allocations to outside of balloon_lock.
>
> Now, Wei is trying to allocate far more pages outside of balloon_lock and
> some more memory inside of balloon_lock in order to perform efficient
> communication between host and guest using scatter-gather API.
>
> Since pages allocated outside of balloon_lock are not visible to the OOM
> notifier path until fill_balloon() holds balloon_lock (and enqueues the
> pending pages), allocating more pages than now may lead to unacceptably
> premature OOM killer invocation.
>
> It would be possible to make the pending pages visible to the OOM notifier
> path. But there is no need to try to allocate memory so hard from the
> beginning. As of commit 18468d93e53b037e ("mm: introduce a common
> interface for balloon pages mobility"), it made sense to try allocation
> as hard as possible. But after commit 5a10b7dbf904bfe0 ("virtio_balloon:
> free some memory from balloon on OOM"),
However, please not that this behavious is optional.
Can you keep the current behaviour when deflate on OOM is disabled?
> it no longer makes sense to try
> allocation as hard as possible, for fill_balloon() will after all have to
> release just allocated memory if some allocation request hits the OOM
> notifier path. Therefore, this patch disables __GFP_DIRECT_RECLAIM when
> allocating memory for inflating the balloon. Then, memory for inflating
> the balloon can be allocated inside balloon_lock, and we can release just
> allocated memory as needed.
>
> Also, this patch adds __GFP_NOWARN, for possibility of hitting memory
> allocation failure is increased by removing __GFP_DIRECT_RECLAIM, which
> might spam the kernel log buffer. At the same time, this patch moves
> "puff" messages to outside of balloon_lock, for it is not a good thing to
> block the OOM notifier path for 1/5 of a second. (Moreover, it is better
> to release the workqueue and allow processing other pending items. But
> that change is out of this patch's scope.)
>
> __GFP_NOMEMALLOC is currently not required because workqueue context
> which calls balloon_page_alloc() won't cause __gfp_pfmemalloc_flags()
> to return ALLOC_OOM. But since some process context might start calling
> balloon_page_alloc() in future, this patch does not remove
> __GFP_NOMEMALLOC.
>
> (Only compile tested. Please do runtime tests before committing.)
You will have to find someone to test it.
> Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Cc: Michael S. Tsirkin <mst@xxxxxxxxxx>
> Cc: Wei Wang <wei.w.wang@xxxxxxxxx>
> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>
> ---
> drivers/virtio/virtio_balloon.c | 23 +++++++++++++----------
> mm/balloon_compaction.c | 5 +++--
> 2 files changed, 16 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index dfe5684..4d9409b 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -141,7 +141,7 @@ static void set_page_pfns(struct virtio_balloon *vb,
> page_to_balloon_pfn(page) + i);
> }
>
> -static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> +static unsigned fill_balloon(struct virtio_balloon *vb, size_t num, bool *oom)
> {
> unsigned num_allocated_pages;
> unsigned num_pfns;
> @@ -151,24 +151,19 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> /* We can only do one array worth at a time. */
> num = min(num, ARRAY_SIZE(vb->pfns));
>
> + mutex_lock(&vb->balloon_lock);
> +
> for (num_pfns = 0; num_pfns < num;
> num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
> struct page *page = balloon_page_alloc();
>
> if (!page) {
> - dev_info_ratelimited(&vb->vdev->dev,
> - "Out of puff! Can't get %u pages\n",
> - VIRTIO_BALLOON_PAGES_PER_PAGE);
> - /* Sleep for at least 1/5 of a second before retry. */
> - msleep(200);
> + *oom = true;
> break;
> }
> -
> balloon_page_push(&pages, page);
> }
>
> - mutex_lock(&vb->balloon_lock);
> -
> vb->num_pfns = 0;
>
> while ((page = balloon_page_pop(&pages))) {
> @@ -404,17 +399,25 @@ static void update_balloon_size_func(struct work_struct *work)
> {
> struct virtio_balloon *vb;
> s64 diff;
> + bool oom = false;
>
> vb = container_of(work, struct virtio_balloon,
> update_balloon_size_work);
> diff = towards_target(vb);
>
> if (diff > 0)
> - diff -= fill_balloon(vb, diff);
> + diff -= fill_balloon(vb, diff, &oom);
> else if (diff < 0)
> diff += leak_balloon(vb, -diff);
> update_balloon_size(vb);
>
> + if (oom) {
> + dev_info_ratelimited(&vb->vdev->dev,
> + "Out of puff! Can't get %u pages\n",
> + VIRTIO_BALLOON_PAGES_PER_PAGE);
> + /* Sleep for at least 1/5 of a second before retry. */
> + msleep(200);
> + }
> if (diff)
> queue_work(system_freezable_wq, work);
> }
> diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
> index ef858d5..067df56 100644
> --- a/mm/balloon_compaction.c
> +++ b/mm/balloon_compaction.c
> @@ -21,8 +21,9 @@
> */
> struct page *balloon_page_alloc(void)
> {
> - struct page *page = alloc_page(balloon_mapping_gfp_mask() |
> - __GFP_NOMEMALLOC | __GFP_NORETRY);
> + struct page *page = alloc_page((balloon_mapping_gfp_mask() |
> + __GFP_NOMEMALLOC | __GFP_NOWARN) &
> + ~__GFP_DIRECT_RECLAIM);
> return page;
> }
> EXPORT_SYMBOL_GPL(balloon_page_alloc);
> --
> 1.8.3.1