Re: [PATCH][5/?] count writeback pages in nr_scanned

From: Nick Piggin
Date: Wed Jan 05 2005 - 18:53:09 EST


Marcelo Tosatti wrote:
The caller would need to wait on all the zones which can satisfy the
caller's allocation request. A bit messy, although not rocket science. One would have to be careful to avoid additional CPU consumption due to
delivery of multiple wakeups at each I/O completion.

We should be able to demonstrate that such a change really fixes some
problem though. Otherwise, why bother?


Agreed. The current scheme works well enough, we dont have spurious OOM kills
anymore, which is the only "problem" such change ought to fix.

You might have performance increase in some situations I believe (because you have perzone waitqueues), but I agree its does not seem to be worth the
trouble.

I think what Andrea is worried about is that blk_congestion_wait is
fairly vague, and can be a source of instability in the scanning
implementation.

For example, if you have a heavy IO workload that is saturating your
disks, blk_congestion_wait may do the right thing and sleep until
they become uncongested and writeout can continue.

But at 2:00 am, when your backup job is trickling writes into another
block device, blk_congestion_wait returns much earlier, and before
many pages have been cleaned.

Bad example? Yeah maybe, but I think this is what Andrea is getting
at. Would it be a problem to replace those blk_congestion_waits with
unconditional io_schedule_timeout()s? That would be the dumb-but-more
-deterministic solution.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/