On Thu, Apr 27, 2006 at 01:58:42AM +1000, Nick Piggin wrote:
Keir Fraser wrote:
In more detail the problem is that, since we're still running on the page tables while clearing them, the CPU may choose to prefetch a half-cleared pte into its TLB, and then execute speculative memory accesses based on that mapping (including ones that may write-allocate cachelines, leading to problems like the AMD AGP GART deadlock Linux had a year or so back).
What do you mean, you're still running on the page tables? The CPU can
still walk the pagetables?
Because if ptep_get_and_clear_full is passed non zero in the full
argument, then that specific translation should never see another
access. I didn't know CPUs now actually resolve TLB misses as part of
speculative prefetching... does this really happen?
For instance, during speculative execution on POWER, we can take a
TLB miss for a speculative load and start a table-walk.
I'm not sure what "speculative prefetching" means in this case... just
regular hardware-initiated prefetching (where I suppose one could use the
modifier "speculative") on POWER will only prefetch to a page-boundary.
So, slightly OT, as this is not about x86 CPUs... but thought people
might be interested.