Re: [PATCH][2.6-mm] Readahead issues and AIO read speedup

From: Badari Pulavarty (pbadari@us.ibm.com)
Date: Thu Aug 07 2003 - 11:01:01 EST


Suparna,

I noticed the exact same thing while running a database benchmark
on filesystems (without AIO). I added instrumentation in the SCSI layer
to record the I/O pattern and found that we were doing a huge number
(4 million) of 4K reads in my benchmark run. Tracing them showed that all
of those reads were generated by the slow read path, because the readahead
window was maximally shrunk. When I forced the readahead code to read 16K
(my database pagesize) whenever the ra window was closed, I saw a 20%
improvement in my benchmark. I have asked Ramchandra Pai (linuxram@us.ibm.com)
to investigate further.
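
Roughly, the hack I tried has this shape (a sketch with placeholder
names -- "window_closed" and DB_PAGE_PAGES are made up here, and this
is not the actual instrumented change):

	#include <linux/fs.h>
	#include <linux/mm.h>

	/*
	 * When the readahead window has collapsed, still submit readahead
	 * for a full database page worth of pages instead of letting the
	 * slow path read 4K at a time.
	 */
	#define DB_PAGE_PAGES	4	/* 16K database page / 4K PAGE_SIZE */

	static void force_min_readahead(struct address_space *mapping,
					struct file *filp, pgoff_t index,
					int window_closed)
	{
		if (window_closed)
			do_page_cache_readahead(mapping, filp, index,
						DB_PAGE_PAGES);
	}

do_page_cache_readahead() is the existing 2.6 readahead entry point
that submits I/O for up to nr_to_read pages starting at the given index.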

Thanks,
Badari

On Thursday 07 August 2003 03:01 am, Suparna Bhattacharya wrote:
> I noticed a problem with the way do_generic_mapping_read
> and readahead works for the case of large reads, especially
> random reads. This was leading to very inefficient behaviour
> for a stream of AIO reads. (See the results a little later
> in this note)
>
> 1) We should be reading ahead at least the pages that are
> required by the current read request (even if the ra window
> is maximally shrunk). I think I've seen this in 2.4 - we
> seem to have lost that in 2.5.
> The result is that sometimes (large random reads) we end
> up doing reads one page at a time, waiting for each to complete
> before reading the next page, and so on, even for a large read
> (until we build up a readahead window again).
>
> 2) Once the ra window is maximally shrunk, the responsibility
> for reading the pages and re-building the window is shifted
> to the slow path in read, which breaks down in the case of
> a stream of AIO reads where multiple iocbs submit reads
> to the same file rather than serialise the wait for i/o
> completion.
>
> So here is a patch that fixes this by making sure we do
> (1), and by pushing up the handle_ra_miss calls for the
> maximally shrunk case to before the loop that waits for I/O
> completion.
>
> Does it make a difference? A lot, actually.
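
For anyone following along, the shape of the fix described above is
roughly this (assumed names -- "window_shrunk" stands in for the real
window-state test -- and this is a sketch, not Suparna's actual patch):

	#include <linux/fs.h>
	#include <linux/mm.h>

	/*
	 * Called from do_generic_mapping_read() before the loop that locks
	 * and waits on each page: submit I/O covering the whole request and
	 * record the ra misses up front, so a stream of AIO reads does not
	 * serialise page-by-page on I/O completion.
	 */
	static void readahead_cover_request(struct address_space *mapping,
					    struct file_ra_state *ra,
					    struct file *filp,
					    pgoff_t first, pgoff_t last,
					    int window_shrunk)
	{
		pgoff_t index;

		if (!window_shrunk)
			return;

		/* (1) read ahead at least the pages this request needs */
		do_page_cache_readahead(mapping, filp, first,
					last - first + 1);

		/*
		 * (2) note the misses now, before anyone waits on a page,
		 * so the window gets rebuilt without relying on the slow
		 * path in read.
		 */
		for (index = first; index <= last; index++)
			handle_ra_miss(mapping, ra, index);
	}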
