Re: [PATCH 0/4] [RFC][v4] Workaround for Xeon Phi PTE A/D bits erratum

From: Dave Hansen
Date: Wed Jul 13 2016 - 10:05:20 EST


On 07/13/2016 04:37 AM, Vlastimil Babka wrote:
> On 07/02/2016 12:28 AM, Benjamin Herrenschmidt wrote:
>> With the errata, don't you have a situation where a processor in
>> the second category will write and set D despite P having been
>> cleared (due to the race) and thus causing us to miss the transfer
>> of that D to the struct page and essentially completely miss that the
>> physical page is dirty?
>
> Seems to me like this is indeed possible, but...

No, this isn't possible with the erratum.

I had some off-list follow-up with Ben, and included this description in
the later post of the patch:
> These bits are truly "stray". In the case of the Dirty bit, the
> thread associated with the stray set was *not* allowed to write to
> the page. This means that we do not have to launder the bit(s); we
> can simply ignore them.
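
As a rough illustration of what "ignore them" could mean in code, here is a
small user-space model, not the actual patch. The bit positions match the
x86 PTE layout (Accessed is bit 5, Dirty is bit 6), but MODEL_STRAY_MASK and
model_pte_none() are hypothetical names invented for this sketch. The idea
is that an "is this PTE empty?" check masks out the two bits that may be
set by the erratum, so a cleared PTE with a stray A or D bit still counts
as empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 PTE bit positions; the MODEL_ names themselves are hypothetical. */
#define MODEL_PAGE_ACCESSED (UINT64_C(1) << 5)
#define MODEL_PAGE_DIRTY    (UINT64_C(1) << 6)

/* Assumption: only A and D can be set "stray" on a cleared PTE. */
#define MODEL_STRAY_MASK (MODEL_PAGE_ACCESSED | MODEL_PAGE_DIRTY)

static_assert(MODEL_STRAY_MASK == 0x60, "A and D are bits 5 and 6");

/* Treat a PTE as empty if nothing but stray A/D bits is set. */
static bool model_pte_none(uint64_t pte)
{
    return (pte & ~MODEL_STRAY_MASK) == 0;
}
```

With this check, model_pte_none(MODEL_PAGE_DIRTY) is true, while any entry
with the Present bit (bit 0) set is still reported as non-empty.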


>> (Leading to memory corruption).
>
> ... what memory corruption, exactly?

In this (non-existent) scenario, we would lose writes to mmap()'d files
because we did not see the dirty bit during the "get" part of
ptep_get_and_clear().
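
To make the "get" part concrete, here is a simplified user-space model of
the dirty-bit transfer. It is a sketch under assumptions: struct model_page,
model_ptep_get_and_clear() and model_zap_one() are hypothetical stand-ins,
and a C11 atomic exchange stands in for the atomic read-and-clear of the
PTE. The point is that the Dirty bit is taken from the old value returned
by the exchange, so a D bit set by hardware after the exchange would not be
seen by this path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 PTE bit positions; the MODEL_ names themselves are hypothetical. */
#define MODEL_PAGE_PRESENT (UINT64_C(1) << 0)
#define MODEL_PAGE_DIRTY   (UINT64_C(1) << 6)

/* Hypothetical stand-in for the per-page dirty state. */
struct model_page {
    bool dirty;
};

/* Atomically fetch the old PTE value and leave the entry cleared. */
static uint64_t model_ptep_get_and_clear(_Atomic uint64_t *ptep)
{
    return atomic_exchange(ptep, 0);
}

/* Tear down one mapping and carry the Dirty bit over to the page. */
static void model_zap_one(_Atomic uint64_t *ptep, struct model_page *page)
{
    assert(ptep && page);

    uint64_t old = model_ptep_get_and_clear(ptep);

    /* Only the value observed by the "get" decides whether the page is dirty. */
    if (old & MODEL_PAGE_DIRTY)
        page->dirty = true;
}
```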

> If a process is writing to its
> memory from one thread and unmapping it from other thread at the same
> time, there are no guarantees anyway?

It's not just unmapping; it's also swap, NUMA migration, etc. We
clear the PTE, flush the TLB, then re-populate it.
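
The clear, flush, re-populate sequence can be sketched as a user-space
model. Everything named model_* here is hypothetical: the TLB flush is just
a counter, and the PTE is a plain C11 atomic. The sketch only shows the
ordering of the three steps, and that the final store rewrites the whole
entry, so whatever was sitting in the cleared PTE is overwritten.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* x86 PTE bit positions; the MODEL_ names themselves are hypothetical. */
#define MODEL_PAGE_PRESENT (UINT64_C(1) << 0)
#define MODEL_PAGE_RW      (UINT64_C(1) << 1)
#define MODEL_PAGE_DIRTY   (UINT64_C(1) << 6)

/* Counts simulated flushes; a real kernel would invalidate TLB entries here. */
static unsigned int model_tlb_flushes;

static void model_flush_tlb(void)
{
    model_tlb_flushes++;
}

/* Replace a present PTE with new_val and return the old value. */
static uint64_t model_change_pte(_Atomic uint64_t *ptep, uint64_t new_val)
{
    assert(ptep);

    uint64_t old = atomic_exchange(ptep, 0); /* 1. clear, keeping the old value */
    model_flush_tlb();                       /* 2. flush */
    atomic_store(ptep, new_val);             /* 3. re-populate the whole entry */

    return old;
}
```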

> Would anything sensible rely on
> the guarantee that if the write in such racy scenario didn't end up as a
> segfault (i.e. unmapping was faster), then it must hit the disk? Or are
> there any other scenarios where zap_pte_range() is called? Hmm, but how
> does this affect the page migration scenario, can we lose the D bit there?

Yeah, it's not just zap_pte_range(), it's everywhere that we change a
present PTE.

> And maybe related thing that just occurred to me, what if page is made
> non-writable during fork() to catch COW? Any race in that one, or just
> the P bit? But maybe the argument would be the same as above...

Yeah, the argument is the same.