RE: [RFC] memcpy_nocache() and memcpy_writethrough()

From: Elliott, Robert (Persistent Memory)
Date: Sun Jan 01 2017 - 21:43:22 EST


> -----Original Message-----
> From: linux-kernel-owner@xxxxxxxxxxxxxxx [mailto:linux-kernel-
> owner@xxxxxxxxxxxxxxx] On Behalf Of Al Viro
> Sent: Friday, December 30, 2016 8:26 PM
> Subject: [RFC] memcpy_nocache() and memcpy_writethrough()
>
...
> Why does pmem need writethrough warranties, anyway?

Using either
* nontemporal store instructions; or
* regular store instructions followed by a sequence of cache flush
  and store fence instructions (e.g., clflushopt or clwb + sfence)

ensures that write data has reached an "ADR-safe zone" that the system
promises will be persistent even if there is a surprise power loss or
a CPU suffers from an error that isn't totally catastrophic. (An
example of a totally catastrophic error: the CPU getting disconnected
from the SDRAM, which will always lose data on an NVDIMM-N.)
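
As a rough user-space illustration of the two options listed above
(not kernel code, and not the proposed memcpy_nocache() or
memcpy_writethrough()), here is a minimal sketch using x86 SSE2
intrinsics. The helper names, the 64-byte line size, and the SKETCH_X86
macro are made up for this example. It uses clflush because that is
available on every x86-64 CPU; clflushopt or clwb would take its place
where supported. The sfence after the nontemporal stores is an
assumption added here, since those stores are weakly ordered. On
non-x86 targets the sketch degrades to a plain copy with no flushing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h>          /* _mm_clflush, _mm_stream_si64, _mm_sfence */
#define SKETCH_X86 1
#else
#define SKETCH_X86 0            /* other targets: plain copy, no flushing */
#endif

#define CACHE_LINE 64           /* assumed cache line size */

/* Regular stores, then flush every touched cache line, then fence. */
static void sketch_copy_flush(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);                      /* regular stores */
#if SKETCH_X86
    uintptr_t line = (uintptr_t)dst & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end  = (uintptr_t)dst + len;

    for (; line < end; line += CACHE_LINE)
        _mm_clflush((const void *)line);        /* clflushopt/clwb if available */
    _mm_sfence();   /* needed after clflushopt/clwb; harmless after clflush */
#endif
}

/* Nontemporal stores; dst must be 8-byte aligned, len a multiple of 8. */
static void sketch_copy_nontemporal(void *dst, const void *src, size_t len)
{
    assert((uintptr_t)dst % 8 == 0 && len % 8 == 0);
#if SKETCH_X86
    long long *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < len / 8; i++) {
        long long v;

        memcpy(&v, s + 8 * i, sizeof(v));       /* alignment-safe load */
        _mm_stream_si64(&d[i], v);              /* movnti: bypasses the cache */
    }
    _mm_sfence();                               /* order the weakly-ordered stores */
#else
    memcpy(dst, src, len);
#endif
}
```

On ordinary DRAM both helpers behave like memcpy(); the difference
only matters when dst is a mapping of persistent memory.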

The ACPI NFIT Flush Hints provide a guarantee that data is safe even
in the case of a CPU error, but that feature is not present in all
systems for all types of persistent memory.

> All explanations I've found on the net had been along the lines of
> "we should not store a pointer to pmem data structure until the
> structure itself had been committed to pmem itself" and it looks
> like something that ought to be a job for barriers - after all,
> we don't want the pointer store to be observed by _anything_
> in the system until the earlier stores are visible, so what makes
> pmem different from e.g. another CPU or a PCI busmaster, or...

Newly written data becomes globally visible before it becomes ADR-safe.
This means software could act on the new data before a power loss, then
see the old data reappear after the power loss - not good. Software
needs to understand that any data in the process of being written is
indeterminate until the persistence guarantee is met. The BTT (Block
Translation Table) shows one way that software can avoid that problem.
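
To make the visible-before-persistent window concrete, here is a toy
model in plain C. It is a sketch under stated assumptions, not real
pmem code and not the BTT algorithm: the struct and the toy_* names are
invented for this example, "visible" stands for what loads return, and
"durable" stands for what has reached the ADR-safe zone.

```c
#include <assert.h>
#include <string.h>

#define CELL_SIZE 16

/* Toy model: 'visible' is what loads return (caches included);
 * 'durable' is what has reached the ADR-safe zone. */
struct toy_pmem {
    char visible[CELL_SIZE];
    char durable[CELL_SIZE];
};

static void toy_store(struct toy_pmem *p, const char *s)
{
    size_t n = strlen(s);

    if (n > CELL_SIZE - 1)
        n = CELL_SIZE - 1;
    memset(p->visible, 0, CELL_SIZE);
    memcpy(p->visible, s, n);           /* globally visible, not yet durable */
}

static void toy_persist(struct toy_pmem *p)
{
    memcpy(p->durable, p->visible, CELL_SIZE);  /* stands in for flush + fence */
}

static void toy_power_loss(struct toy_pmem *p)
{
    memcpy(p->visible, p->durable, CELL_SIZE);  /* only durable data survives */
}

static const char *toy_load(const struct toy_pmem *p)
{
    return p->visible;
}

static void toy_demo(void)
{
    struct toy_pmem p;

    memset(&p, 0, sizeof(p));
    toy_store(&p, "old");
    toy_persist(&p);

    /* Hazard: the new data is acted on before it is persistent. */
    toy_store(&p, "new");
    assert(strcmp(toy_load(&p), "new") == 0);   /* a reader sees "new" */
    toy_power_loss(&p);
    assert(strcmp(toy_load(&p), "old") == 0);   /* then "old" reappears */

    /* Safe order: persist first, only then treat the data as written. */
    toy_store(&p, "new");
    toy_persist(&p);
    toy_power_loss(&p);
    assert(strcmp(toy_load(&p), "new") == 0);
}
```

Calling toy_demo() walks through both cases. Between toy_store() and
toy_persist() the cell is in the indeterminate state described above.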

---
Robert Elliott, HPE Persistent Memory