Re: [PATCH 06/13] writeback: bdi write bandwidth estimation

From: Wu Fengguang
Date: Thu Nov 18 2010 - 01:51:24 EST


On Thu, Nov 18, 2010 at 07:08:37AM +0800, Andrew Morton wrote:
> On Wed, 17 Nov 2010 12:27:26 +0800
> Wu Fengguang <fengguang.wu@xxxxxxxxx> wrote:
>
> > + w = min(elapsed / (HZ/100), 128UL);
>
> I did try setting HZ=10 many years ago, and the kernel blew up.
>
> I do recall hearing of people who set HZ very low, perhaps because
> their huge machines were seeing performance prolems when the timer tick
> went off. Probably there's no need to do that any more.
>
> But still, we shouldn't hard-wire the (HZ >= 100) assumption if we
> don't absolutely need to, and I don't think it is absolutely needed
> here.

Fair enough. Here is the fix. The other (HZ/10) will be addressed by
another patch that increase it to MAX_PAUSE=max(HZ/5, 1).

Thanks,
Fengguang
---

Subject: writeback: prevent divide error on tiny HZ
Date: Thu Nov 18 12:19:56 CST 2010

As suggested by Andrew and Peter:

I do recall hearing of people who set HZ very low, perhaps because their
huge machines were seeing performance prolems when the timer tick went
off. Probably there's no need to do that any more.

But still, we shouldn't hard-wire the (HZ >= 100) assumption if we don't
absolutely need to, and I don't think it is absolutely needed here.

People who do cpu bring-up on very slow FPGAs also lower HZ as far as
possible.

CC: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Signed-off-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>
---
mm/page-writeback.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

--- linux-next.orig/mm/page-writeback.c 2010-11-18 12:35:18.000000000 +0800
+++ linux-next/mm/page-writeback.c 2010-11-18 12:35:38.000000000 +0800
@@ -490,6 +490,7 @@ void bdi_update_write_bandwidth(struct b
unsigned long *bw_time,
s64 *bw_written)
{
+ const unsigned long unit_time = max(HZ/100, 1);
unsigned long written;
unsigned long elapsed;
unsigned long bw;
@@ -499,7 +500,7 @@ void bdi_update_write_bandwidth(struct b
goto snapshot;

elapsed = jiffies - *bw_time;
- if (elapsed < HZ/100)
+ if (elapsed < unit_time)
return;

/*
@@ -513,7 +514,7 @@ void bdi_update_write_bandwidth(struct b

written = percpu_counter_read(&bdi->bdi_stat[BDI_WRITTEN]) - *bw_written;
bw = (HZ * PAGE_CACHE_SIZE * written + elapsed/2) / elapsed;
- w = min(elapsed / (HZ/100), 128UL);
+ w = min(elapsed / unit_time, 128UL);
bdi->write_bandwidth = (bdi->write_bandwidth * (1024-w) + bw * w) >> 10;
bdi->write_bandwidth_update_time = jiffies;
snapshot:
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/