Re: PG_zero

From: Andrea Arcangeli
Date: Mon Nov 01 2004 - 21:08:20 EST


On Tue, Nov 02, 2004 at 04:26:02AM +1100, Nick Piggin wrote:
> Not sure if this is wise. Reclaimed pages should definitely be cache
> cold. Other freeing is assumed cache hot and LRU ordered on the hot
> list which seems right... but I think you want the cold list for page
> reclaim, don't you?

I retain hot and cold info in the same list. That avoids wasting RAM.

I call it the hot-cold quicklist.
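
A rough user-space illustration of keeping hot and cold information in one list follows. It is a sketch under assumptions, not code from the patch: the names struct page_sketch, hcl_free and hcl_alloc are hypothetical, and the head-is-hot, tail-is-cold ordering is one plausible reading of the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page descriptor; NOT the kernel's struct page. */
struct page_sketch {
    struct page_sketch *prev, *next;
    int id;
};

/* A single list holds both kinds of pages: hot at the head, cold at the tail. */
struct hot_cold_list {
    struct page_sketch *head;   /* hottest page */
    struct page_sketch *tail;   /* coldest page */
    size_t count;
};

static void hcl_init(struct hot_cold_list *l)
{
    l->head = l->tail = NULL;
    l->count = 0;
}

/* Free a page: cache-hot pages go to the head, cold (e.g. reclaimed)
 * pages go to the tail of the same list. */
static void hcl_free(struct hot_cold_list *l, struct page_sketch *p, int cold)
{
    p->prev = p->next = NULL;
    if (!l->head) {
        l->head = l->tail = p;
    } else if (cold) {
        p->prev = l->tail;
        l->tail->next = p;
        l->tail = p;
    } else {
        p->next = l->head;
        l->head->prev = p;
        l->head = p;
    }
    l->count++;
}

/* Allocate a page: hot requests take from the head, cold requests
 * from the tail. Returns NULL when the list is empty. */
static struct page_sketch *hcl_alloc(struct hot_cold_list *l, int cold)
{
    struct page_sketch *p = cold ? l->tail : l->head;

    if (!p)
        return NULL;
    if (p->prev)
        p->prev->next = p->next;
    else
        l->head = p->next;
    if (p->next)
        p->next->prev = p->prev;
    else
        l->tail = p->prev;
    p->prev = p->next = NULL;
    l->count--;
    return p;
}
```

With one list there is no second, possibly empty, per-cpu list holding pages back; a hot request that finds no hot pages simply ends up taking whatever is left.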

> Could the API be made nicer by clearing the page for you if it didn't
> find a PG_zero page?

That's what get_zeroed_page is already doing with my patch applied. I
natively and transparently support the feature via the API that already
exists for it (and it also makes sure to clear PG_zero, since there's
no guarantee that the user will free the page after clearing it again).
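
The behaviour described above can be sketched in user space roughly as follows. This is an illustration under assumptions, not the patch itself: struct zpage_sketch, its pg_zero field and sketch_get_zeroed_page are hypothetical stand-ins for the real page flag and allocator path.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define ZPAGE_SKETCH_SIZE 4096

/* Hypothetical page: pg_zero stands in for the PG_zero page flag. */
struct zpage_sketch {
    bool pg_zero;
    unsigned char data[ZPAGE_SKETCH_SIZE];
};

/* Counts how many times a page really had to be cleared. */
static unsigned long sketch_clear_count;

/* Hand out a zeroed page. A page already known to be zero is not
 * cleared again. The flag is dropped in both cases, because the
 * caller may dirty the page and free it without clearing it. */
static struct zpage_sketch *sketch_get_zeroed_page(struct zpage_sketch *p)
{
    if (!p->pg_zero) {
        memset(p->data, 0, sizeof(p->data));
        sketch_clear_count++;
    }
    p->pg_zero = false;
    return p;
}
```

The caller sees the same contract either way: the returned page is zero-filled, and it never has to know whether the clearing was skipped.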

> I have the feeling that it might not be worthwhile doing zero on idle.

Yes, it only gives a 200% boost to one workload; for the rest it seems a
no-op, but I'm very far from being memory bound on my test boxes. The
sysctl will disable it at runtime, and we can make it enabled or disabled
by default by tweaking the static setting to 1 or to 0 at compile time.

> You've got chance of blowing the cache on zeroing pages that won't be
> used for a while. If you do uncached writes then you've changed the
> problem to memory bandwidth (I guess doesn't matter much on UP).

Normally the cpu is really perfectly idle; only in a few cases do we
clear pages, and it starts clearing the hot caches first. Plus, the
refiling may find pages from the buddy that are already PG_zero, and then
it will not do any clearpage at all.
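
A minimal user-space sketch of an idle zeroing pass with a runtime switch follows. It assumes hypothetical names throughout (sketch_idle_zero_enabled stands in for the sysctl, struct idle_page_sketch for a page with a PG_zero-like flag) and does not model the refiling from the buddy or any cache effects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define IDLE_PAGE_SKETCH_SIZE 4096

/* Hypothetical page: pg_zero stands in for the PG_zero page flag. */
struct idle_page_sketch {
    bool pg_zero;
    unsigned char data[IDLE_PAGE_SKETCH_SIZE];
};

/* Static default chosen at compile time (1 = enabled, 0 = disabled);
 * writable at runtime, as a sysctl would allow. */
static int sketch_idle_zero_enabled = 1;

/* One idle pass over a batch of free pages. Pages that are already
 * marked zero are skipped, so no clearing is done for them.
 * Returns the number of pages actually cleared. */
static size_t sketch_idle_zero_pass(struct idle_page_sketch *pages, size_t n)
{
    size_t cleared = 0;

    if (!sketch_idle_zero_enabled)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (pages[i].pg_zero)
            continue;
        memset(pages[i].data, 0, sizeof(pages[i].data));
        pages[i].pg_zero = true;
        cleared++;
    }
    return cleared;
}
```

A second pass over the same batch clears nothing, which is the cheap case described above.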

> It seems like a good idea to do zero pages in the page allocator if
> at all (rather than slab), but I guess you don't want to complicate
> it unless it shows improvements in macro benchmarks.
>
> Sorry my feedback isn't much, it is not based on previous experience.

I seriously doubt any previous experience is at all similar to what I've
implemented, but I completely missed any previous experience (probably
because I was doing mostly 2.4 work when you were developing 2.5), so I
may be completely wrong. But certainly lots of things are going wrong in
the current per-cpu lists and those are all fixed, plus this is mostly
for the pte_quicklist that has been dropped by mistake from 2.6 (I just
got a complaint about it from SGI saying latency is too high, and that's
what drove me to do PG_zero, to close that bug; but along the way I found
many more wrong things, and eventually I figured out the only design I
would ever let my cpu use to do idle zeroing on top of what I already
implemented, though I certainly agree that's the least interesting part
of the patch).