Re: [RFC PATCH v1 0/6] use mm to manage NVDIMM (pmem) zone

From: Jeff Moyer
Date: Mon May 07 2018 - 15:08:41 EST


Dan Williams <dan.j.williams@xxxxxxxxx> writes:

> On Mon, May 7, 2018 at 11:46 AM, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>> On Mon, May 07, 2018 at 10:50:21PM +0800, Huaisheng Ye wrote:
>>> Traditionally, NVDIMMs are treated by mm(memory management) subsystem as
>>> DEVICE zone, which is a virtual zone and both its start and end of pfn
>>> are equal to 0, mm wouldn't manage NVDIMM directly as DRAM, kernel uses
>>> corresponding drivers, which locate at \drivers\nvdimm\ and
>>> \drivers\acpi\nfit and fs, to realize NVDIMM memory alloc and free with
>>> memory hot plug implementation.
>>
>> You probably want to let linux-nvdimm know about this patch set.
>> Adding to the cc.
>
> Yes, thanks for that!
>
>> Also, I only received patch 0 and 4. What happened
>> to 1-3,5 and 6?
>>
>>> With current kernel, many mm's classical features like the buddy
>>> system, swap mechanism and page cache couldn't be supported to NVDIMM.
>>> What we are doing is to expand kernel mm's capacity to make it to handle
>>> NVDIMM like DRAM. Furthermore we make mm could treat DRAM and NVDIMM
>>> separately, that means mm can only put the critical pages to NVDIMM

Please define "critical pages."

>>> zone, here we created a new zone type as NVM zone. That is to say for
>>> traditional(or normal) pages which would be stored at DRAM scope like
>>> Normal, DMA32 and DMA zones. But for the critical pages, which we hope
>>> them could be recovered from power fail or system crash, we make them
>>> to be persistent by storing them to NVM zone.

[...]

> I think adding yet one more mm-zone is the wrong direction. Instead,
> what we have been considering is a mechanism to allow a device-dax
> instance to be given back to the kernel as a distinct numa node
> managed by the VM. It seems it's time to dust off those patches.

What's the use case? The above patch description seems to indicate an
intent to recover contents after a power loss. Without seeing the whole
series, I'm not sure how that's accomplished in a safe or meaningful
way.

Huaisheng, could you provide a bit more background?

Thanks!
Jeff