Re: clustering page-ins

Chuck Lever (cel@monkey.org)
Tue, 20 Jul 1999 14:05:35 -0400 (EDT)


On Tue, 20 Jul 1999, Rik van Riel wrote:
> Currently, we only do readahead of the cluster the requested
> page falls in (and only if the page is not in memory). This
> works rather well, but not well enough for all apps.
>
> By defining the top and bottom 1/8th of each zone a 'border'
> page and loading the next zone when a page in the border zone
> is loaded (AND the requested page is already in memory, meaning
> we have already done readahead on the current zone and it proved
> to be successful), THEN we read in the zone next door.

how do you prevent this algorithm from reading behind when a sequential
forward read goes through the bottom 1/8th of a zone that is already in
cache?
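
A minimal sketch of the border rule as quoted above, to make the question concrete. The zone size, the reading of "bottom" as the lowest-numbered eighth of a zone, and the names `neighbour_to_read` and `simulate` are all assumptions made for illustration; none of this is kernel code.

```python
ZONE_PAGES = 16            # assumed zone (cluster) size in pages
BORDER = ZONE_PAGES // 8   # pages in each border region

def neighbour_to_read(page, cached_zones):
    """Return the neighbouring zone the border rule would load, or None."""
    zone, offset = divmod(page, ZONE_PAGES)
    if zone not in cached_zones:
        return None                      # a miss: the rule does not apply
    if offset < BORDER:                  # bottom border: zone before
        return zone - 1 if zone > 0 else None
    if offset >= ZONE_PAGES - BORDER:    # top border: zone after
        return zone + 1
    return None

def simulate(pages, cached_zones):
    """Replay an access pattern; list the zones loaded by the border rule."""
    cached = set(cached_zones)
    extra = []
    for page in pages:
        zone = page // ZONE_PAGES
        if zone not in cached:
            cached.add(zone)             # ordinary demand read of the zone
            continue
        target = neighbour_to_read(page, cached)
        if target is not None and target not in cached:
            cached.add(target)
            extra.append(target)
    return extra

# forward sequential read through zone 1, which is already cached
print(simulate(range(16, 32), {1}))
```

Under these assumptions the forward read pulls in zone 0 as soon as it touches the bottom border of zone 1, even though a forward reader never goes back to it. The rule as stated has no notion of direction.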

> Because the previous readahead was (apparently) successful, we
> know we have a large chance of reading in data we will actually
> need PLUS the algorithm works for both read-ahead and read-behind
> (and zoned data files where the zones exceed cluster size).

as jamie says, it's easy to recognize strict monotonically increasing
sequential access to pages (i think this is what he meant by reading ahead
in the "perfect sequential" case) and aggressively read ahead in that
case. this might be overkill, but it seems like you'd need to actually
track the direction of the reads if you want to support read-behind when
reading backwards too. is reading backwards prevalent enough that special
logic should be added to watch for it? seems like read-behind is more
important in the random case, where you're paging in an executable or
library. and that's already taken care of with the current clustering
implementation.
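
A rough sketch of what tracking the direction of reads could look like. The class name `DirectionTracker`, the per-file state it keeps, and the threshold of three consecutive steps are hypothetical choices for illustration, not an existing kernel interface.

```python
class DirectionTracker:
    """Detect strictly sequential page access, forward or backward."""

    def __init__(self, threshold=3):
        self.last = None            # last page index seen
        self.direction = 0          # +1 forward, -1 backward, 0 unknown
        self.run = 0                # consecutive steps in that direction
        self.threshold = threshold  # steps needed before we trust the run

    def access(self, page):
        """Record an access; return +1, -1, or 0 (no sequential run)."""
        step = 0 if self.last is None else page - self.last
        self.last = page
        if step in (1, -1):
            if step == self.direction:
                self.run += 1
            else:
                self.direction = step
                self.run = 1
        else:
            # any jump or repeat breaks the run
            self.direction = 0
            self.run = 0
        return self.direction if self.run >= self.threshold else 0

forward = DirectionTracker()
print([forward.access(p) for p in [10, 11, 12, 13, 14]])

backward = DirectionTracker()
print([backward.access(p) for p in [20, 19, 18, 17]])
```

A caller would read ahead when the result is +1 and read behind when it is -1. A result of 0 leaves the decision to whatever clustering is already in place, which is the random case described above.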

> > > however, as the number of extra pages read during read-ahead increases,
> > > the overall efficiency of the system as a whole drops,
> >
> > On this I disagree. If a big disk read is not much slower than a small
> > disk read (i.e. a modern disk), but random head movement is slow, I'd
> > have thought the added clustering due to read-ahead would reduce the
> > total number of head movements when two applications are competing for
> > different areas of the disk.
>
> Clustering does indeed work in this way. I have observed
> (during a busy time of my emergency packetstorm mirror) a
> VERY loaded 486/66 doing about 7000(!) pagefaults a second
> but 'only' 70 disk operations a second.

i was narrowly defining read-ahead as speculatively reading pages that may
be needed in the future. read-ahead doesn't necessarily mean reading in
large blocks, nor does it necessarily mean reading sequentially in order
to reduce seeks. clustering *causes* read-ahead, but also increases the
size of read operations. i think of clustering and read-ahead as
separate, but related, concepts. :)
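
To illustrate the distinction, here is a small sketch of clustering on its own, assuming a cluster is an aligned, fixed-size group of pages that is read whole when any page in it faults. The 16-page size and the helper names are assumptions for illustration only.

```python
CLUSTER_PAGES = 16  # assumed cluster size in pages

def cluster_for(page, cluster_pages=CLUSTER_PAGES):
    """Return the aligned range of pages read in for a fault on `page`."""
    start = (page // cluster_pages) * cluster_pages
    return range(start, start + cluster_pages)

def behind_and_ahead(page, cluster_pages=CLUSTER_PAGES):
    """Count the extra pages read before and after the faulting page."""
    pages = cluster_for(page, cluster_pages)
    behind = page - pages.start
    ahead = pages.stop - 1 - page
    return behind, ahead

print(behind_and_ahead(13))
print(behind_and_ahead(0))
```

Under this assumption one fault always produces one large read, and how much of that read lies ahead of or behind the faulting page depends only on where the page falls in its cluster. The speculation is a side effect of the larger I/O size rather than a prediction about the access pattern.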

- Chuck Lever

-- corporate: <chuckl@netscape.com> personal: <chucklever@netscape.net> or <cel@monkey.org>

The Linux Scalability project: http://www.citi.umich.edu/projects/linux-scalability/

