On Fri, 2016-07-01 at 10:46 -0700, Dave Hansen wrote:
> The Intel(R) Xeon Phi(TM) Processor x200 Family (codename: Knights
> Landing) has an erratum where a processor thread setting the Accessed
> or Dirty bits may not do so atomically against its checks for the
> Present bit. This may cause a thread (which is about to page fault)
> to set A and/or D, even though the Present bit had already been
> atomically cleared.

Interesting... I always wondered where the Intel docs specify that Present
is tested atomically with the setting of A and D. I couldn't find it.

Isn't there a more fundamental issue, however, in that you may actually lose
those bits? For example, if we do an munmap, in zap_pte_range() we first
exchange all the PTEs with 0 using ptep_get_and_clear_full(), and we then
transfer the D bit that we just read into the struct page.

We rely on the fact that D will never be set again, i.e. that what we got is
a "final" D bit. In other words, we rely on the fact that a processor either:
- Has a cached PTE in its TLB with D set, in which case it can still
write to the page until we flush the TLB or
- Doesn't have a cached PTE in its TLB with D set and so will fail
to do so due to the atomic P check, thus never writing.
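
To make that assumption concrete, here is a rough userspace sketch of it,
not kernel code. The names zap_one(), hw_set_dirty_atomic() and struct
page_model are made up for illustration; only the P (bit 0) and D (bit 6)
positions come from the x86 PTE layout. It models a walker that only sets D
if P is still set, in one atomic step:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_P (1ull << 0)   /* Present bit of an x86 PTE */
#define PTE_D (1ull << 6)   /* Dirty bit of an x86 PTE */

/* Hypothetical stand-in for the dirty state kept in struct page. */
struct page_model { bool dirty; };

/*
 * Roughly what zap_pte_range() does for one PTE: atomically swap the
 * entry with 0, then transfer the D bit we just read to the page.
 */
static void zap_one(_Atomic uint64_t *pte, struct page_model *pg)
{
    uint64_t old = atomic_exchange(pte, 0);

    if (old & PTE_D)
        pg->dirty = true;
}

/*
 * Model of a walker that checks P and sets D atomically: it gives up
 * (i.e. would fault) if P is clear, so D cannot appear after the zap.
 * Returns true if D was set, false if the access would fault.
 */
static bool hw_set_dirty_atomic(_Atomic uint64_t *pte)
{
    uint64_t old = atomic_load(pte);

    do {
        if (!(old & PTE_P))
            return false;
    } while (!atomic_compare_exchange_weak(pte, &old, old | PTE_D));

    return true;
}
```

With this model, once zap_one() has run, hw_set_dirty_atomic() always
returns false and the PTE stays 0, so the D bit read by the exchange really
is final.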

With the erratum, don't you have a situation where a processor in the second
category will write and set D despite P having been cleared (due to the
race), thus causing us to miss the transfer of that D to the struct page and
essentially completely miss that the physical page is dirty (leading to
memory corruption)?
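
The interleaving I am worried about can be replayed deterministically in a
small userspace sketch. This only models the PTE bits, under the assumption
that the affected walker behaves like a separate "check P" step followed by
an unconditional "set D" step; whether the store itself goes through is
exactly the question above. All names here (replay_race(), zap_one(),
struct page_model, the hw_* helpers) are hypothetical:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_P (1ull << 0)   /* Present bit of an x86 PTE */
#define PTE_D (1ull << 6)   /* Dirty bit of an x86 PTE */

/* Hypothetical stand-in for the dirty state kept in struct page. */
struct page_model { bool dirty; };

/* Step 1 of the modelled walker: look at P. */
static bool hw_check_present(_Atomic uint64_t *pte)
{
    return (atomic_load(pte) & PTE_P) != 0;
}

/* Step 2 of the modelled walker: set D without re-checking P. */
static void hw_set_dirty_unchecked(_Atomic uint64_t *pte)
{
    atomic_fetch_or(pte, PTE_D);
}

/* Clear the PTE and transfer the D bit that was read, as in the zap path. */
static void zap_one(_Atomic uint64_t *pte, struct page_model *pg)
{
    uint64_t old = atomic_exchange(pte, 0);

    if (old & PTE_D)
        pg->dirty = true;
}

/*
 * Replay the race: CPU B checks P, CPU A zaps the PTE, CPU B then sets D.
 * Returns true if D ended up in the cleared PTE while the page model
 * was never marked dirty.
 */
static bool replay_race(struct page_model *pg)
{
    _Atomic uint64_t pte = PTE_P;
    bool saw_present;

    saw_present = hw_check_present(&pte);   /* CPU B, step 1 */
    zap_one(&pte, pg);                      /* CPU A */
    if (saw_present)
        hw_set_dirty_unchecked(&pte);       /* CPU B, step 2 */

    return (atomic_load(&pte) & PTE_D) && !pg->dirty;
}
```

In this model replay_race() returns true: the D bit lands in a PTE that has
already been cleared and read, so nothing ever transfers it to the page.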