Re: unified page and buffer cache?

From: Matthew Wilcox
Date: Fri May 07 2010 - 14:30:11 EST


On Fri, May 07, 2010 at 11:45:34AM -0400, Phillip Susi wrote:
> On 5/7/2010 9:53 AM, Matthew Wilcox wrote:
> > The problem you're seeing is aliasing in the page cache, not a failed
> > unification of the buffer and page caches. Pages are addressed by
> > (mapping, offset). Each inode generally has its own mapping. Depending
> > on the file system, directories may be addressed by their own inode's
> > mapping, or by the block device's mapping.
>
> If there are two mappings that don't know about each other, then the
> caches don't seem very unified to me. If I write to the file and that
> data sits in the mapping for the inode, then I read the corresponding
> blocks through the block device, and it has a different mapping, then I
> read the old data off the disk instead of the new data in the cache. I
> thought that this exact problem had been fixed long ago.

A long time ago (was this 2.0? 1.2?) the buffer cache and the page cache
were actually separate caches. Then the buffer cache was rewritten to
point into the page cache, and we were all grateful.

As I said, you're seeing something completely different. The page cache
is virtually indexed, not physically indexed. As generations of CPU
designers have found, that's a lot faster, but aliases can bite you.
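
To make the aliasing concrete, here is a toy userspace model, not kernel
code. The names PageCache, Disk, "inode:file" and "bdev", and the block
number 100, are all made up for illustration. It only assumes what is said
above: pages are looked up by (mapping, offset), so the same disk block
reached through two mappings ends up as two independent cached pages.

```python
class Disk:
    """Hypothetical backing store: block number -> bytes."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)

    def read(self, blkno):
        return self.blocks[blkno]


class PageCache:
    """Toy cache keyed by (mapping, offset), never by disk block."""

    def __init__(self, disk):
        self.disk = disk
        self.pages = {}  # (mapping, offset) -> cached data

    def read(self, mapping, offset, blkno):
        key = (mapping, offset)
        if key not in self.pages:
            # Cache miss: fill the page from disk.
            self.pages[key] = self.disk.read(blkno)
        return self.pages[key]

    def write(self, mapping, offset, data):
        # The page is now dirty in the cache; nothing is written back here.
        self.pages[(mapping, offset)] = data


disk = Disk({100: b"old"})
cache = PageCache(disk)

# Write through the file's own mapping; in this model, offset 0 of the
# file is backed by disk block 100.
cache.write("inode:file", 0, b"new")

# Read the same block back through both mappings.
via_file = cache.read("inode:file", 0, blkno=100)
via_bdev = cache.read("bdev", 100, blkno=100)

print(via_file, via_bdev)
```

The read through the file's mapping hits the dirty page, while the read
through the block device's mapping misses and goes to disk. Making the
second lookup find the first page would need a reverse index from disk
block to every mapping that caches it, which is the expensive part.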

> > Resolving aliasing would be horribly expensive, so it's unlikely to
> > happen.
>
> Back to the drawing board I guess. Maybe ext could be fixed to use an
> inode mapping for directories instead of relying on the block device
> mapping, then I could readahead() the directory instead of having to go
> to the block device at all.

That would be possible, but would waste memory space. But we've all
got gigabytes of RAM these days, maybe nobody cares.

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."