Re: Prezeroing V2 [0/3]: Why and When it works
From: Andrew Morton
Date: Thu Dec 23 2004 - 16:34:46 EST
Paul Mackerras <paulus@xxxxxxxxx> wrote:
> Christoph Lameter writes:
> > The most expensive operation in the page fault handler is (apart from SMP
> > locking overhead) the zeroing of the page.
> Re-reading this I see that you mean the zeroing of the page that is
> mapped into the process address space, not the page table pages. So
> ignore my previous reply.
> Do you have any statistics on how often a page fault needs to supply a
> page of zeroes versus supplying a copy of an existing page, for real
When the workload is a gcc run, the pagefault handler dominates the system
time. That's the page zeroing.
> In any case, unless you have magic page-zeroing hardware, I am still
> inclined to think that zeroing the page at the time of the fault is
> the most efficient, since that means the page will be hot in the cache
> for the process to use. If you zero it earlier using CPU stores, it
> can only cause more overall memory traffic, as far as I can see.
x86's non-temporal store instructions (movnti, movntq and friends) provide
a way of initialising memory without trashing the caches, and they have
pretty good bandwidth, I believe. We should wire that up to these patches
and see if it speeds things up.
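
For illustration only, here is a rough userspace sketch of zeroing a page
with non-temporal stores. It is not taken from the patches under
discussion: the helper names clear_page_nocache() and page_is_zero(), the
4096-byte PAGE_BYTES constant and the use of the SSE2 intrinsic
_mm_stream_si128 (movntdq) are all assumptions made for the example, and
in-kernel code would have to deal with FPU/SSE state differently. On
builds without SSE2 it simply falls back to memset().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si128, _mm_setzero_si128, _mm_sfence */
#endif

#define PAGE_BYTES 4096  /* assumed page size for this sketch */

/*
 * Hypothetical helper: zero one page, bypassing the caches where the
 * compiler target offers SSE2 non-temporal stores.
 */
static void clear_page_nocache(void *page)
{
    /* movntdq requires a 16-byte aligned destination */
    assert(((uintptr_t)page % 16) == 0);

#if defined(__SSE2__)
    __m128i zero = _mm_setzero_si128();
    __m128i *p = (__m128i *)page;

    for (size_t i = 0; i < PAGE_BYTES / sizeof(__m128i); i++)
        _mm_stream_si128(&p[i], zero);  /* write-combining store, no cache fill */

    /* Order the streaming stores before any later ordinary stores. */
    _mm_sfence();
#else
    /* Portable fallback: ordinary cached stores. */
    memset(page, 0, PAGE_BYTES);
#endif
}

/* Hypothetical checker: returns 1 if every byte of the page is zero. */
static int page_is_zero(const void *page)
{
    const unsigned char *b = (const unsigned char *)page;

    for (size_t i = 0; i < PAGE_BYTES; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

Whether this actually wins depends on the point Paul raises: a page zeroed
this way is not hot in the cache when the process first touches it, so it
would need measuring rather than assuming.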
> I did some measurements once on my G5 powermac (running a ppc64 linux
> kernel) of how long clear_page takes, and it only takes 96ns for a 4kB
40GB/s. Is that straight into L1 or does the measurement include writeback?
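
The rough figure comes from dividing the page size by the quoted time. A
small sketch of that arithmetic, where bandwidth_gb_per_s() is a
hypothetical helper and GB is taken to mean 10^9 bytes:

```c
#include <assert.h>

/*
 * Hypothetical helper: bytes per nanosecond is numerically the same as
 * GB/s when 1 GB = 10^9 bytes.
 */
static double bandwidth_gb_per_s(double bytes, double nanoseconds)
{
    assert(nanoseconds > 0.0);
    return bytes / nanoseconds;
}

/* 4096 bytes in 96 ns: 4096 / 96 is roughly 42.7 GB/s. */
```

That is more than a memory bus of the time would be expected to sustain,
which is why it matters whether the measurement includes writeback.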