Re: [PATCH 10/18] writeback: dirty position control - bdi reserve area

From: Peter Zijlstra
Date: Tue Sep 06 2011 - 10:10:25 EST


On Sun, 2011-09-04 at 09:53 +0800, Wu Fengguang wrote:
> Keep a minimal pool of dirty pages for each bdi, so that the disk IO
> queues won't underrun.
>
> It's particularly useful for JBOD and small-memory systems.
>
> Note that this is not enough when memory is really tight (in comparison
> to write bandwidth). It may result in (pos_ratio > 1) at the setpoint
> and push the dirty pages high. This is more or less intended because the
> bdi is in danger of IO queue underflow. However, the global dirty
> pages, when pushed close to the limit, will eventually counteract our
> desire to push up the low bdi_dirty.
>
> In low memory JBOD tests we do see disks under-utilized from time to
> time. The additional fix may be to add a BDI_async_underrun flag to
> indicate that the block write queue is running low and it's time to
> quickly fill the queue by unthrottling the tasks regardless of the
> global limit.
>
> Signed-off-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>
> ---
> mm/page-writeback.c | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> --- linux-next.orig/mm/page-writeback.c 2011-08-26 20:12:19.000000000 +0800
> +++ linux-next/mm/page-writeback.c 2011-08-26 20:13:21.000000000 +0800
> @@ -487,6 +487,16 @@ unsigned long bdi_dirty_limit(struct bac
> *   0 +------------.------------------.----------------------*------------->
> *           freerun^          setpoint^                 limit^   dirty pages
> *
> + * (o) bdi reserve area
> + *
> + * The bdi reserve area tries to keep a reasonable number of dirty pages for
> + * preventing block queue underrun.
> + *
> + * reserve area, scale up rate as dirty pages drop low
> + * |<----------------------------------------------->|
> + * |-------------------------------------------------------*-------|----------
> + * 0                                           bdi setpoint^       ^bdi_thresh


So why not call the thing bdi freerun?

> * (o) bdi control lines
> *
> * The control lines for the global/bdi setpoints both stretch up to @limit.
> @@ -634,6 +644,22 @@ static unsigned long bdi_position_ratio(
>  	pos_ratio *= x_intercept - bdi_dirty;
>  	do_div(pos_ratio, x_intercept - bdi_setpoint + 1);
>
> +	/*
> +	 * bdi reserve area, safeguard against dirty pool underrun and disk idle
> +	 *
> +	 * It may push the desired control point of global dirty pages higher
> +	 * than setpoint. It's not necessary in single-bdi case because a
> +	 * minimal pool of @freerun dirty pages will already be guaranteed.
> +	 */
> +	x_intercept = min(write_bw, freerun);
> +	if (bdi_dirty < x_intercept) {

So the point of the freerun point is that we never throttle before it,
so basically all the below shouldn't be needed at all, right?

> +		if (bdi_dirty > x_intercept / 8) {
> +			pos_ratio *= x_intercept;
> +			do_div(pos_ratio, bdi_dirty);
> +		} else
> +			pos_ratio *= 8;
> +	}
> +
>  	return pos_ratio;
>  }


So why not add:

	if (likely(dirty < freerun))
		return 2;

at the start of this function and leave it at that?
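
For readers following the thread, the ramp applied by the quoted hunk can be
sketched in plain userspace C. This is only an illustration under stated
assumptions, not kernel code: reserve_area_scale() is a hypothetical name,
ordinary 64-bit division stands in for do_div(), and pos_ratio is taken in
whatever fixed-point unit the caller already uses.

```c
#include <stdint.h>

/*
 * Hypothetical userspace model of the reserve-area ramp in the quoted hunk.
 * Below x_intercept = min(write_bw, freerun), pos_ratio is scaled by
 * x_intercept / bdi_dirty; once bdi_dirty is at or below x_intercept / 8,
 * the boost is held at a flat factor of 8.
 */
uint64_t reserve_area_scale(uint64_t pos_ratio, unsigned long bdi_dirty,
			    unsigned long write_bw, unsigned long freerun)
{
	unsigned long x_intercept = write_bw < freerun ? write_bw : freerun;

	if (bdi_dirty < x_intercept) {
		if (bdi_dirty > x_intercept / 8)
			/* bdi_dirty > 0 here, so the division is safe */
			pos_ratio = pos_ratio * x_intercept / bdi_dirty;
		else
			pos_ratio *= 8;
	}

	return pos_ratio;
}
```

With write_bw = 800 and freerun = 1600 (so x_intercept = 800), a pos_ratio of
1024 becomes 2048 at bdi_dirty = 400 and 8192 at bdi_dirty = 100 or lower; at
bdi_dirty = 800 and above it is returned unchanged. The early return suggested
above would replace this gradual ramp with a single constant below freerun.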

