Re: PG_zero

From: Andrea Arcangeli
Date: Mon Nov 01 2004 - 21:58:20 EST


On Sun, Oct 31, 2004 at 07:35:03AM -0800, Martin J. Bligh wrote:
> I'll non-micro-benchmark stuff for you on big machines if you want, but I've
> wasted enough time coding this stuff already ;-)
>
> BTW, the one really useful thing the whole page zeroing stuff did was to
> shift the profiled cost of page zeroing out to the routine acutally using
> the pages, as it's no longer just do_anonymous_page taking the cache hit.

I'm not a big fan of the idle zeroing myself despite I implemented it.
The only performance difference I can measure is the microbenchmark
running 3 times faster, everything else seems the same.

However the real point of the patch is to address all other issues with
the per-cpu list and to fixup the lack of pte_quicklist in 2.6, and to
avoid wasting zeropages (like the pte_quicklists did) by sharing them
with all other page-zero users.

I'm fine to drop the idle zeroing stuff, it just took another half an
hour to add it so I did it just in case (it can be already disabled via
sysctl of course).

btw, how did you implement idle zeroing? had you two per-cpu lists,
where the idle zeroing only refile across the two per-cpu lists like I
did? Did you address the COW with zeropage? the design I did for it is
the only one I could imagine beneficial at all, I would never attempt
taking any spinlock on the idle task. No idea if you also were just
refiling pages across two per-cpu lists. I need two lists anyways, one
is the hot-cold one (shared to avoid wasting memory like 2.6 mainline)
and one is the zero-list that is needed to avoid the lack of
pte_quicklists and to cache the PG_zero info (it actually renders
pte_quicklists obsolete since they're less optimal at utilizing zero
resources). Plus the patch fixes other issues like trying all per-cpu
lists in the classzone before falling back to the buddy allocator. It
removes the low watermark and other minor details. The idle zeroing is
just a minor thing in the whole patch, the point of PG_zero is to create
an infrastructure in the main page allocator that is capable of caching
already zero data like ptes.

So far this in practice has been an improvement or a noop, and it solves
various theoretical issues too, but I still need to test on some bigger
box and see if it makes any difference there or not... But since it's
rock solid here, it was good enough for posting ;)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/