Re: [PATCH] mm: make VM_MAX_READAHEAD configurable

From: Christian Ehrhardt
Date: Mon Oct 12 2009 - 01:54:28 EST


Wu Fengguang wrote:
Hi Martin,

On Fri, Oct 09, 2009 at 09:49:50PM +0800, Martin Schwidefsky wrote:
On Fri, 9 Oct 2009 14:29:52 +0200
Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:

On Fri, Oct 09 2009, Peter Zijlstra wrote:
On Fri, 2009-10-09 at 13:19 +0200, Ehrhardt Christian wrote:
From: Christian Ehrhardt <ehrhardt@xxxxxxxxxxxxxxxxxx>

On one hand the define VM_MAX_READAHEAD in include/linux/mm.h is just a default
and can be configured per block device queue.
On the other hand a lot of admins do not use it, therefore it is reasonable to
set a wise default.

This patch allows configuring the value via Kconfig mechanisms and therefore
allows assigning different defaults depending on other Kconfig symbols.

Using this, the patch increases the default max readahead for s390, improving
sequential throughput in a lot of scenarios with almost no drawbacks (only
theoretical workloads with a lot of concurrent sequential read patterns on a
very low memory system suffer due to page cache thrashing, as expected).
[snip]
The patch from Christian fixes a performance regression in the latest
distributions for s390. So we would opt for a larger value, 512KB seems
to be a good one. I have no idea what that will do to the embedded
space which is why Christian chose to make it configurable. Clearly
the better solution would be some sort of system control that can be
modified at runtime.

May I ask for more details about your performance regression and why
it is related to readahead size? (we didn't change VM_MAX_READAHEAD..)
Sure, the performance regression appeared when comparing Novell SLES10 vs. SLES11.
While you are right, Wu, that the upstream default never changed so far, SLES10 had a
patch applied that set it to 512KB.

As mentioned before, I didn't expect to get a generic 128->512 patch accepted, hence
the configurable solution. But after Peter and Jens replied so quickly, stating that
changing the default in the kernel would be the wrong way to go, I started looking for
userspace alternatives. At least for my issues I could fix it with device-specific udev
rules too.

And as Andrew mentioned, the diversity of devices causes any default to be wrong for one
installation or another. To solve that, the udev approach can also distinguish between
device types (this might be easier on s390 than on other architectures because I only
need to take care of two disk types atm - and both should get 512).
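
A minimal sketch of what such a per-device-type setting amounts to, assuming
the readahead limit is exposed as /sys/block/<dev>/queue/read_ahead_kb. The
device name prefixes "dasd" and "sd" are assumptions chosen for illustration
(the two disk types are not named above), and the helper names are
hypothetical. This is not the udev rule itself, only the equivalent sysfs
write:

```python
from pathlib import Path


def read_ahead_path(device, sysfs_root="/sys/block"):
    """Return the sysfs file holding the readahead limit (in KB) for a device."""
    return Path(sysfs_root) / device / "queue" / "read_ahead_kb"


def get_read_ahead_kb(device, sysfs_root="/sys/block"):
    """Read the current readahead limit in KB."""
    return int(read_ahead_path(device, sysfs_root).read_text().strip())


def set_read_ahead_kb(device, kb, sysfs_root="/sys/block"):
    """Write a new readahead limit in KB (needs root on a real system)."""
    if kb < 0:
        raise ValueError("readahead size must not be negative")
    read_ahead_path(device, sysfs_root).write_text(f"{kb}\n")


def apply_by_prefix(prefixes, kb, sysfs_root="/sys/block"):
    """Set the limit on every block device whose name starts with a prefix.

    Returns the sorted list of device names that were changed.
    """
    changed = []
    for entry in sorted(Path(sysfs_root).iterdir()):
        if not entry.name.startswith(tuple(prefixes)):
            continue
        if not read_ahead_path(entry.name, sysfs_root).exists():
            continue
        set_read_ahead_kb(entry.name, kb, sysfs_root)
        changed.append(entry.name)
    return changed


if __name__ == "__main__":
    # Assumed prefixes for illustration only; adjust to the local disk types.
    print(apply_by_prefix(("dasd", "sd"), 512))
```

Unlike a udev rule, this only affects devices present when it runs; devices
that appear later keep the kernel default.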

The testcase for anyone who wants to experiment with it is almost too easy: the biggest
impact can be seen with single-threaded iozone - I get ~40% better throughput when
increasing the readahead size to 512 (even bigger RA sizes don't help much in my
environment, probably due to fast devices).
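
For a quick check without iozone, a single-threaded sequential read can be
timed roughly as below. This is a stand-in sketch, not the iozone test; the
function name and the 64KB chunk size are assumptions. The file should not
already be in the page cache, otherwise readahead has no effect on the result:

```python
import time


def sequential_read_mb_per_s(path, chunk_size=64 * 1024):
    """Read a file front to back in fixed-size chunks.

    Returns (bytes_read, throughput in MB/s).
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own buffering on top of the kernel readahead.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = total / (1024 * 1024) / elapsed if elapsed > 0 else float("inf")
    return total, rate


if __name__ == "__main__":
    import sys

    nbytes, rate = sequential_read_mb_per_s(sys.argv[1])
    print(f"{nbytes} bytes read at {rate:.1f} MB/s")
```

Run it once per readahead setting on the same large file, dropping the page
cache in between (for example via /proc/sys/vm/drop_caches), and compare the
reported rates.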

--

Grüsse / regards, Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
