Re: Random file I/O regressions in 2.6
From: Bill Davidsen
Date: Tue May 11 2004 - 17:30:57 EST
Andrew Morton wrote:
Ram Pai <linuxram@xxxxxxxxxx> wrote:
I am nervous about this change. You are totally getting rid of
lazy-readahead and that was the optimization which gave the best
possible boost in performance.
Because it disabled the large readahead outside the area which the app is
reading. But it's still reading too much.
Let me see how this patch does with a DSS benchmark.
That was not a real patch. More work is surely needed to get that right.
In the normal large random workload this extra page would have
compensated for all the wasted readaheads.
I disagree that 64k is "normal"!
However in the case of
sysbench with Andrew's ra-copy patch the readahead calculation is not
happening quite right. Is it worth trying to get a marginal gain
with sysbench at the cost of taking a big hit on DSS benchmarks,
aio-tests, iozone and probably others? Or am I making an unsubstantiated
claim? I will get back with results.
It shouldn't hurt at all - the app does a seek, we perform the
As I say, my main concern is that we correctly transition from seeky access
to linear access and resume readahead.
One real problem is that you are trying to do in the kernel what would
be best done in the application, or failing that in glibc, because the
benefit of readahead varies by fd rather than by device. Consider a
program reading data from a file and putting it in a database. The
benefit of readahead for the sequentially accessed data file is higher than
for the database's seek-read combinations. The library could do readahead based on the
bytes read since the last seek on a by-file basis, something the kernel
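
A rough sketch of that per-file idea, in user space, might look like the following. It is only an illustration under stated assumptions, not anything glibc or the kernel actually does: the names ReadaheadTracker and read_with_hint, the 8 KiB threshold and the 128 KiB cap are all made up for the example. The only real interfaces used are os.pread and os.posix_fadvise with POSIX_FADV_WILLNEED.

```python
import os


class ReadaheadTracker:
    """Hypothetical per-fd state: bytes read since the last seek."""

    def __init__(self, threshold=8192, max_window=131072):
        # threshold and max_window are illustrative values, not tuned ones.
        self.threshold = threshold
        self.max_window = max_window
        self.bytes_since_seek = 0

    def on_seek(self):
        # A seek ends the sequential run, so start counting again.
        self.bytes_since_seek = 0

    def on_read(self, nbytes):
        # Return how many bytes to hint ahead; 0 means "no readahead yet".
        self.bytes_since_seek += nbytes
        if self.bytes_since_seek < self.threshold:
            return 0
        return min(self.bytes_since_seek, self.max_window)


def read_with_hint(fd, tracker, offset, size):
    """Read at an explicit offset, then hint the next range if the
    file has been read sequentially for long enough."""
    data = os.pread(fd, size, offset)
    hint = tracker.on_read(len(data))
    if hint and hasattr(os, "posix_fadvise"):
        # Advisory only: the kernel is free to ignore the hint.
        os.posix_fadvise(fd, offset + len(data), hint,
                         os.POSIX_FADV_WILLNEED)
    return data
```

With one tracker per open file, the sequentially read input file would build up a large hint window, while the database file, where the caller invokes on_seek() before each read, would stay at zero.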
This is not to say the kernel work hasn't been a benefit, but note that
with all the patches 2.4 still seems to outperform 2.6. And that's a
problem since other parts of 2.6 scale so well. I do see that 2.4 seems
to outperform 2.6 for usenet news, where you have small reads against a
modest database, a few TB or so, and 400-2000 processes doing random
reads against the data. Settings and schedulers seem to have only modest
-bill davidsen (davidsen@xxxxxxx)
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me