Re: [PATCH v4.6-rc] writeback: Fix performance regression in wb_over_bg_thresh()

From: Sedat Dilek
Date: Thu May 05 2016 - 17:43:47 EST


On 5/5/16, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> Hi Jens,
>
> This fix seems to have been missed; it should go into v4.6.
>
> Please apply.
>
> Thanks,
> Miklos
>
>
> From: Howard Cochran <hcochran@xxxxxxxxxxxxxxxx>
> Subject: writeback: Fix performance regression in wb_over_bg_thresh()
> Date: Thu, 10 Mar 2016 01:12:39 -0500
>
> Commit 947e9762a8dd ("writeback: update wb_over_bg_thresh() to use
> wb_domain aware operations") unintentionally changed this function's
> meaning from "are there more dirty pages than the background writeback
> threshold" to "are there more dirty pages than the writeback threshold".
> The background writeback threshold is typically half of the writeback
> threshold, so this had the effect of raising the number of dirty pages
> required to cause a writeback worker to perform background writeout.
>
> This can cause a very severe performance regression when a BDI uses
> BDI_CAP_STRICTLIMIT because balance_dirty_pages() and the writeback worker
> can now disagree on whether writeback should be initiated.
>
> For example, in a system having 1GB of RAM, a single spinning disk, and a
> "pass-through" FUSE filesystem mounted over the disk, application code
> mmapped a 128MB file on the disk and was randomly dirtying pages in that
> mapping.
>
> Because FUSE uses strictlimit and has a default max_ratio of only 1%, in
> balance_dirty_pages, thresh is ~200, bg_thresh is ~100, and the
> dirty_freerun_ceiling is the average of those, ~150. So, it pauses the
> dirtying processes when we have 151 dirty pages and wakes up a background
> writeback worker. But the worker tests the wrong threshold (200 instead of
> 100), so it does not initiate writeback and just returns.
>
> Thus, balance_dirty_pages keeps looping, sleeping and then waking up the
> worker who will do nothing. It remains stuck in this state until the few
> dirty pages that we have finally expire and we write them back for that
> reason. Then the whole process repeats, resulting in near-zero throughput
> through the FUSE BDI.
>
> The fix is to call the parameterized variant of wb_calc_thresh, so that the
> worker will do writeback if the bg_thresh is exceeded which was the
> behavior before the referenced commit.
>
> Fixes: 947e9762a8dd ("writeback: update wb_over_bg_thresh() to use wb_domain
> aware operations")
> Signed-off-by: Howard Cochran <hcochran@xxxxxxxxxxxxxxxx>
> Acked-by: Tejun Heo <tj@xxxxxxxxxx>
> Signed-off-by: Miklos Szeredi <mszeredi@xxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> # v4.2+

Fell free to add my...

Tested-by Sedat Dilek <sedat.dilek@xxxxxxxxx>

- sed@ -

> ---
> mm/page-writeback.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1910,7 +1910,8 @@ bool wb_over_bg_thresh(struct bdi_writeb
> if (gdtc->dirty > gdtc->bg_thresh)
> return true;
>
> - if (wb_stat(wb, WB_RECLAIMABLE) > __wb_calc_thresh(gdtc))
> + if (wb_stat(wb, WB_RECLAIMABLE) >
> + wb_calc_thresh(gdtc->wb, gdtc->bg_thresh))
> return true;
>
> if (mdtc) {
> @@ -1924,7 +1925,8 @@ bool wb_over_bg_thresh(struct bdi_writeb
> if (mdtc->dirty > mdtc->bg_thresh)
> return true;
>
> - if (wb_stat(wb, WB_RECLAIMABLE) > __wb_calc_thresh(mdtc))
> + if (wb_stat(wb, WB_RECLAIMABLE) >
> + wb_calc_thresh(mdtc->wb, mdtc->bg_thresh))
> return true;
> }
>
>