Re: Race in shrink_cache

From: Daniel Phillips (phillips@arcor.de)
Date: Thu Sep 05 2002 - 13:41:36 EST


On Thursday 05 September 2002 09:53, Andrew Morton wrote:
> Not having to bump page counts when moving pages from the LRU into a private
> list would be nice.

I'm not sure what your intended application is here. It's easy enough to
change the lru state bit to a scalar, the transitions of which are protected
naturally by the lru lock. This gives you N partitions of the lru list
(times M zones) and page_cache_release does the right thing for all of them.

On the other hand, if what you want is a private list that page_cache_release
doesn't act on automatically, all you have to do is set the lru state to zero,
leave the page count incremented and move to the private list. You then take
explicit responsibility for freeing the page or moving it back onto a
mainstream lru list.

An example of an application of the latter technique is a short delay list to
(finally) implement the use-once concept properly. A newly instantiated page
goes onto the hot end of this list instead of the inactive list as it does
now, and after a short delay dependent on the allocation activity in the
system, is moved either to the (per zone) active or inactive list, depending
on whether it was referenced. Thus the use-once list is not per-zone and
removal from it is always explicit. So it's not like the other lru lists,
even though it uses the same link field.

While I'm meandering here, I'll mention that the above approach finally makes
use-once work properly for swap pages, which always had the problem that we
couldn't detect the second, activating reference (and this was fudged by
always marking a swapped-in page as referenced, i.e., kludging away the
mechanism). It also solves the problem of detecting clustered references,
such as reading through a page a byte at a time, which should only count as a
single reference. Right now we do a stupid hack that works in a lot of
cases, but fails in enough cases to be annoying, all in the name of trying to
get by without implementing a dedicated list.

Readahead on the other hand needs to be handled with dedicated per-zone lru
lists, so that we can conveniently and accurately claw back readahead that
happens to have gone too far ahead. So this is an example of the first kind
of usage. Once read, a readahead page moves to the used-once queue, and from
there either to the inactive or active queue as above.

-- 
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Sep 07 2002 - 22:00:26 EST