New lockless pagecache

From: Nick Piggin
Date: Fri Sep 02 2005 - 01:26:13 EST


There were a number of problems with the old lockless pagecache:

- page_count of free pages was unstable, which meant an extra atomic
operation when allocating a new page (had to use get_page instead
of set_page_count).

- This meant a new page flag PG_free had to be introduced.

- Also needed a spin loop and memory barriers added to the page
allocator to prevent a free page with a speculative reference on
it from being allocated.

- This introduced the requirement that interrupts be disabled in
page_cache_get_speculative to prevent deadlock, which was very
disheartening.

- Too complex.

- To cap it off, there was an "unsolvable" race whereby a second
page_cache_get_speculative would not be able to detect it had
picked up a free page, and try to free it again.

The elegant solution was right under my nose the whole time.
I introduced an atomic_cmpxchg and use that to ensure we don't
touch ->_count of a free page. This solves all the above problems.

I think this is getting pretty stable. No guarantees of course,
but it would be great if anyone gave it a test.

--
SUSE Labs, Novell Inc.

Send instant messages to your online friends http://au.messenger.yahoo.com -
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/