Re: [v1 0/2] "Hotremove" persistent memory

From: Dan Williams
Date: Sat Apr 20 2019 - 12:34:40 EST

On Sat, Apr 20, 2019 at 8:32 AM Pavel Tatashin
<pasha.tatashin@xxxxxxxxxx> wrote:
> Recently, support for using persistent memory like regular RAM was
> added to Linux. This work extends this functionality to also allow hot
> removing persistent memory.
> We (Microsoft) have a very important use case for this functionality.
> The requirement is for physical machines with a small amount of RAM (~8G)
> to be able to reboot in a very short period of time (<1s). Yet, there is
> a userland state that is expensive to recreate (~2G).
> The solution is to boot machines with 2G preserved for persistent
> memory.

Makes sense, but I have some questions about the details.

> Copy the state, and hotadd the persistent memory so the machine still has
> all 8G for runtime. Before reboot, hotremove device-dax 2G, copy the memory
> that needs to be preserved to the pmem0 device, and reboot.
> The series of operations looks like this:
> 1. After boot restore /dev/pmem0 to boot
> 2. Convert raw pmem0 to devdax
> ndctl create-namespace --mode devdax --map mem -e namespace0.0 -f
> 3. Hotadd to System RAM
> echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
> echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
> 4. Before reboot hotremove device-dax memory from System RAM
> echo dax0.0 > /sys/bus/dax/drivers/kmem/unbind
> 5. Create raw pmem0 device
> ndctl create-namespace --mode raw -e namespace0.0 -f
> 6. Copy the state to this device

What is the source of this copy? The state that was in the hot-added
memory? Isn't it "already there" since you effectively renamed dax0.0
to pmem0?

> 7. Do kexec reboot, or reboot through firmware, if firmware does not
> zero memory in pmem region.

Wouldn't the dax0.0 contents be preserved regardless? How does the
guest recover the pre-initialized state / how does the kernel know to
give out the same pages to the application as the previous boot?
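
For reference, steps 2 through 5 quoted above could be collected into a
small script along these lines. This is only a sketch: the commands are
copied from the quoted mail, while the run(), hotadd() and hotremove()
helpers and the DRY_RUN, DAX_DEV and NS variables are hypothetical names
introduced here. Steps 1, 6 and 7 (restoring and copying the state, and
the reboot itself) are left out because the thread does not spell out how
they are done.

```shell
#!/bin/sh
# Dry-run sketch of steps 2-5 from the quoted mail.
# By default each command is only printed; set DRY_RUN=0 to execute it
# (that needs root, ndctl, and a kernel with the dax kmem driver).
DRY_RUN="${DRY_RUN:-1}"
DAX_DEV="${DAX_DEV:-dax0.0}"      # device name as used in the thread
NS="${NS:-namespace0.0}"          # namespace name as used in the thread
DAX_DRIVERS=/sys/bus/dax/drivers

# Hypothetical helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        sh -c "$*"
    fi
}

# Steps 2-3: convert raw pmem to devdax, then hand it to System RAM.
hotadd() {
    run "ndctl create-namespace --mode devdax --map mem -e $NS -f"
    run "echo $DAX_DEV > $DAX_DRIVERS/device_dax/unbind"
    run "echo $DAX_DEV > $DAX_DRIVERS/kmem/new_id"
}

# Steps 4-5: take the memory back out of System RAM, recreate raw pmem.
hotremove() {
    run "echo $DAX_DEV > $DAX_DRIVERS/kmem/unbind"
    run "ndctl create-namespace --mode raw -e $NS -f"
}

# Optional dispatch: "script hotadd" or "script hotremove".
case "${1:-}" in
    hotadd)    hotadd ;;
    hotremove) hotremove ;;
esac
```

Running it with "hotadd" after boot and "hotremove" before reboot prints
the command sequence without touching the system, which makes it easier
to see where the copy in step 6 has to fit relative to the unbind.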