Re: [PATCH V3 1/4] mm: Define coherent device memory (CDM) node

From: Anshuman Khandual
Date: Tue Feb 21 2017 - 05:20:55 EST

On 02/17/2017 07:35 PM, Bob Liu wrote:
> Hi Anshuman,
> I have a few questions about coherent device memory.


> On Wed, Feb 15, 2017 at 8:07 PM, Anshuman Khandual
> <khandual@xxxxxxxxxxxxxxxxxx> wrote:
>> There are certain devices like specialized accelerator, GPU cards, network
>> cards, FPGA cards etc which might contain onboard memory which is coherent
>> along with the existing system RAM while being accessed either from the CPU
>> or from the device. They share some similar properties with that of normal
> What's the general size of this kind of memory?

It's comparable to available system RAM sizes, and not as large as
persistent storage memory or NVDIMM.

>> system RAM but at the same time can also be different with respect to
>> system RAM.
>> User applications might be interested in using this kind of coherent device
> What kind of applications?

Applications which want to use CPU compute as well as device compute on
the same allocated buffer transparently. For example, an application may
want to load the problem statement into the allocated buffer and ask the
device, through its driver, to compute results from it.

>> memory explicitly or implicitly along side the system RAM utilizing all
>> possible core memory functions like anon mapping (LRU), file mapping (LRU),
>> page cache (LRU), driver managed (non LRU), HW poisoning, NUMA migrations
> I didn't see the benefit to manage the onboard memory same way as system RAM.
> Why not just map this kind of onborad memory to userspace directly?
> And only those specific applications can manage/access/use it.

Integration with core MM, along with driver-assisted migrations, gives the
application the ability to use the allocated buffer seamlessly from the
CPU or the device without worrying about the actual physical placement of
the pages. That changes the paradigm for CPU and device based hybrid
compute frameworks, and it cannot be achieved by mapping the device memory
directly into user space.

> It sounds not very good to complicate the core memory framework a lot
> because of some not widely used devices and uncertain applications.

Applications are not uncertain; they intend to use this framework to
achieve hybrid CPU/device compute working transparently on the same
allocated virtual buffer. IIUC we would want the Linux kernel to enable
new device technologies regardless of whether they are widely used or