Re: [PATCH] Radix-tree pagecache for 2.5

From: Ingo Molnar (mingo@elte.hu)
Date: Fri Feb 01 2002 - 05:29:53 EST


On 1 Feb 2002, Momchil Velikov wrote:

> Hmm, worse, yes, the same way as page tables get "worse" with larger
> address spaces.

with the difference that for address spaces one of the preferred methods
of operation is read() [or sendfile(), or any other non-mmap() operation],
while for pagetables the hardware helps to get locking-free access to the
mapped contents.

> Ingo> big files. The thing i'm worried about is the 'big pagecache lock' being
> Ingo> reintroduced again. If eg. a database application puts lots of data into a
>
> Yes, though I'd strongly suspect big database engines can/should/do
> benefit from doing their application specific caching and indexing,
> outperforming whatever cache implementation the OS has.

it's not just databases. It's webservers too, serving content via
sendfile() from a single, bigger file. Think streaming media servers,
where the 'movie of the night' sits in a single big binary blob.

> Ingo> single file (multiple gigabytes - why not), then the
> Ingo> mapping->i_shared_lock becomes a 'big pagecache lock' again, causing
> Ingo> serious SMP contention for even the read() case. Benchmarks show that it's
> Ingo> the distribution of locks that matters on big boxes.
>
> So, we can use a read-write spinlock instead of ->i_shared_lock, ok?

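using read-write locks does not solve the scalability problem: the problem
is the bouncing of the spinlock cacheline from CPU to CPU. even a reader
has to modify the lock word atomically to take the lock, so every CPU that
does a read_lock() still pulls the cacheline over in exclusive state.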
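to illustrate the point, here is a minimal userspace sketch, assuming a
toy reader-writer lock built on C11 atomics. struct toy_rwlock, its
word_writes counter and the toy_* functions are hypothetical names made up
for this example, they are not the kernel's rwlock_t. it only shows that
the reader fast path stores to the shared lock word on both lock and
unlock:

```c
#include <stdatomic.h>

/* Hypothetical toy lock for illustration only, not the kernel's rwlock_t. */
struct toy_rwlock {
	atomic_int   readers;     /* >= 0: number of active readers, -1: writer */
	atomic_ulong word_writes; /* instrumentation: successful stores to 'readers' */
};

static void toy_rwlock_init(struct toy_rwlock *l)
{
	atomic_init(&l->readers, 0);
	atomic_init(&l->word_writes, 0);
}

/* Reader side: even a pure reader must do an atomic read-modify-write. */
static void toy_read_lock(struct toy_rwlock *l)
{
	for (;;) {
		int r = atomic_load(&l->readers);

		/* Spin while a writer holds the lock (r < 0). */
		if (r >= 0 &&
		    atomic_compare_exchange_weak(&l->readers, &r, r + 1)) {
			/* The store above needs the cacheline exclusively. */
			atomic_fetch_add(&l->word_writes, 1);
			return;
		}
	}
}

static void toy_read_unlock(struct toy_rwlock *l)
{
	/* Dropping the read lock is another store to the same cacheline. */
	atomic_fetch_sub(&l->readers, 1);
	atomic_fetch_add(&l->word_writes, 1);
}

/*
 * Run 'n' uncontended read-side critical sections and report how many
 * times the lock word was written. No writer is ever involved.
 */
unsigned long toy_reader_stores(unsigned long n)
{
	struct toy_rwlock l;
	unsigned long i;

	toy_rwlock_init(&l);
	for (i = 0; i < n; i++) {
		toy_read_lock(&l);
		/* a pagecache lookup would go here */
		toy_read_unlock(&l);
	}
	return atomic_load(&l.word_writes);
}
```

calling toy_reader_stores(n) counts two stores to the lock word per
read-side section. with several CPUs doing this on one per-file lock, each
of those stores has to take the cacheline away from whichever CPU touched
it last.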

        Ingo
