Re: [QUICKLIST 0/4] Arch independent quicklists V2

From: Nick Piggin
Date: Tue Mar 13 2007 - 07:30:39 EST


Andrew Morton wrote:
> On Tue, 13 Mar 2007 22:06:46 +1100 Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
>> Andrew Morton wrote:

>>> On Tue, 13 Mar 2007 19:03:38 +1100 Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
>>>
>>>> ...
>>>>
>>>> Page allocator still requires interrupts to be disabled, which this doesn't.


>>>> it is worthwhile.


>>> If you want a zeroed page for pagecache and someone has just stuffed a
>>> known-zero, cache-hot page into the pagetable quicklists, you have good
>>> reason to be upset.

>> The thing is, pagetable pages are the one really good exception to the
>> rule that we should keep cache hot and initialise-on-demand. They
>> typically are fairly sparsely populated and sparsely accessed. Even
>> for last level page tables, I think it is reasonable to assume they will
>> usually be pretty cold.


> eh? I'd have thought that a pte page which has just gone through
> zap_pte_range() will very often have a _lot_ of hot cachelines, and
> that's a common case.
>
> Still. It's pretty easy to test.

Well I guess that would be the case if you had just unmapped a 4MB
chunk that was pretty dense with pages.

My malloc seems to allocate and free in blocks of 128K, so that's only
going to leave about 3% of the last-level pagetable page cache-hot when
it gets freed. Not sure what common mmap(file) access patterns
look like.
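
(Back of the envelope, assuming the 32-bit layout of 1024 x 4-byte entries
per last-level pagetable page, i.e. 4MB of address space per page:

	128KB freed / 4KB per page  = 32 ptes written by zap_pte_range
	32 / 1024 entries          ~= 3% of the pagetable page

On x86-64 it is 512 x 8-byte entries covering 2MB, so more like 6%.)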

The majority of programs I run have a smattering of last-level pagetable
pages, pretty sparsely populated, covering text, libraries, heap, stack,
vdso.

We don't actually have to zap_pte_range the entire page table in
order to free it (IIRC we used to have to, before the 4-level pagetable
patches).

But yeah let's see some tests. I would definitely want to avoid this
extra layer of complexity if it is just as good to return the pages
to the pcp lists.
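
For concreteness, the sort of thing that extra layer amounts to, as a very
rough sketch (illustrative only, not Christoph's actual code; the names and
the link-through-the-first-word trick are made up for the example). The idea
is that a freed pagetable page, known to be all zeroes once its ptes are
gone, gets parked on a per-CPU list and handed back out without re-zeroing
and without the irq disabling that the page allocator's per-cpu lists need:

#include <linux/percpu.h>
#include <linux/gfp.h>

/* Hypothetical, simplified quicklist: a per-CPU stack of known-zero
 * pages, linked through their (otherwise zero) first word. */
struct zero_list {
	void *head;
	int nr;
};
static DEFINE_PER_CPU(struct zero_list, zero_pages);

/* Allocate a zeroed pagetable page; only preemption is disabled. */
static void *pgtable_quick_alloc(void)
{
	struct zero_list *ql = &get_cpu_var(zero_pages);
	void *page = ql->head;

	if (page) {
		ql->head = *(void **)page;
		ql->nr--;
		*(void **)page = NULL;	/* re-zero the word used as a link */
	}
	put_cpu_var(zero_pages);

	if (!page)	/* fall back to the page allocator's pcp lists */
		page = (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
	return page;
}

/* Free a pagetable page whose entries have all been cleared, i.e. it is
 * already zero apart from the link word written below. */
static void pgtable_quick_free(void *page)
{
	struct zero_list *ql = &get_cpu_var(zero_pages);

	*(void **)page = ql->head;
	ql->head = page;
	ql->nr++;
	put_cpu_var(zero_pages);
}

Whether that actually beats free_page() plus taking a cache-hot page back
off the pcp lists and zeroing it on demand is exactly what needs testing.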

> Maybe, dunno. It was apparently a win on powerpc many years ago. I had a
> fiddle with it 5-6 years ago on x86 using a cache-disabled mapping of the
> page. But it needed too much support in core VM to bother. Since then
> we've grown per-cpu page magazines and __GFP_ZERO. Plus I'm not aware of
> anyone having tried doing it on x86 with non-temporal stores.
>
> You can win on specifically constructed benchmarks, easily.
>
> But considering all the other problems you're going to introduce, we'd need
> a significant win on a significant something, IMO.
>
> You waste memory bandwidth. You also use more CPU and memory cycles
> speculatively, ergo you waste more power.


Yeah, prezeroing in idle is probably pointless. But I'm not aware of
anyone having tried it properly...
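
For reference, the non-temporal variant Andrew mentions would look roughly
like this (a user-space illustration with SSE2 intrinsics, not kernel code;
the function name and PAGE_SIZE constant are just for the example). The
streamed stores bypass the cache, so prezeroing doesn't evict anything
useful, but the zeroed page then arrives cache-cold at whoever touches it
next, which ties back to the cache-hot argument above:

#include <emmintrin.h>	/* SSE2: _mm_stream_si128, _mm_setzero_si128 */

#define PAGE_SIZE 4096

/* Zero one page with non-temporal stores: no cachelines are allocated
 * for the destination, so existing cache contents are left alone.
 * The destination must be 16-byte aligned (a page always is). */
static void zero_page_nontemporal(void *page)
{
	__m128i zero = _mm_setzero_si128();
	char *p = page;
	unsigned int off;

	for (off = 0; off < PAGE_SIZE; off += 64) {
		_mm_stream_si128((__m128i *)(p + off), zero);
		_mm_stream_si128((__m128i *)(p + off + 16), zero);
		_mm_stream_si128((__m128i *)(p + off + 32), zero);
		_mm_stream_si128((__m128i *)(p + off + 48), zero);
	}
	_mm_sfence();	/* make the streamed stores visible before the page is used */
}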

--
SUSE Labs, Novell Inc.