Re: Interacting with coherent memory on external devices

From: Austin S Hemmelgarn
Date: Thu Apr 23 2015 - 11:25:37 EST


On 2015-04-23 10:25, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
>
>> They are via MMIO space. The big differences here are that via CAPI the
>> memory can be fully cachable and thus have the same characteristics as
>> normal memory from the processor point of view, and the device shares
>> the MMU with the host.
>>
>> Practically what that means is that the device memory *is* just some
>> normal system memory with a larger distance. The NUMA model is an
>> excellent representation of it.
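
If the CAPI-attached memory really is exposed as just another, CPU-less NUMA
node, then the existing userspace interfaces already cover it. A minimal,
untested sketch using libnuma -- the node number 1 below is purely
hypothetical; a real program would discover it from sysfs or firmware tables
rather than hard-coding it:

/* Sketch: place an allocation on a far node standing in for
 * CAPI-attached device memory.  Build with -lnuma.
 * Node 1 is an assumption, not a real topology. */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	int node = 1;           /* hypothetical device-memory node */
	size_t len = 1 << 20;   /* 1 MiB */
	void *buf;

	if (numa_available() < 0) {
		fprintf(stderr, "no NUMA support\n");
		return 1;
	}
	if (node > numa_max_node()) {
		fprintf(stderr, "node %d not present\n", node);
		return 1;
	}

	/* SLIT-style relative distance; 10 means local. */
	printf("distance 0 -> %d: %d\n", node, numa_distance(0, node));

	buf = numa_alloc_onnode(len, node);
	if (!buf) {
		perror("numa_alloc_onnode");
		return 1;
	}
	memset(buf, 0, len);    /* fault the pages in on that node */
	numa_free(buf, len);
	return 0;
}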

> I sure wish you would be working on using these features to increase
> performance and the speed of communication to devices.
>
> Device memory is inherently different from main memory (otherwise the
> device would be using main memory) and thus not really NUMA. NUMA at least
> assumes that the basic characteristics of memory are the same while just
> the access speeds vary. GPU memory has very different performance
> characteristics and the various assumptions on memory that the kernel
> makes for the regular processors may not hold anymore.
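
For reference, the distance table is how the NUMA model describes how "far"
a node is: a single relative-latency figure per node, with no notion of
bandwidth, cacheability, or whether the memory happens to live on a GPU.
A quick, illustrative read of it from the standard sysfs path:

/* Sketch: dump the SLIT distance row for node 0. */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/sys/devices/system/node/node0/distance", "r");
	char line[256];

	if (!f) {
		perror("fopen");
		return 1;
	}
	if (fgets(line, sizeof(line), f))
		printf("node0 distances: %s", line);
	fclose(f);
	return 0;
}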

You are restricting your definition of NUMA to the narrower sense the industry has settled on. By the academic definition of a NUMA system, this _is_ NUMA. In fact, by that definition, every modern system could be considered NUMA, with each level of cache representing a memory-only node.
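
Memory-only nodes are already something the kernel and userspace know how to
describe: a node whose cpulist is empty is exactly that (numactl --hardware
shows the same information). A rough sketch of spotting them from sysfs:

/* Sketch: walk the first few possible nodes and report which ones
 * have no CPUs attached (an empty cpulist), i.e. memory-only nodes. */
#include <stdio.h>

int main(void)
{
	char path[64], buf[256];
	int n;

	for (n = 0; n < 64; n++) {	/* 64 is an arbitrary cap */
		FILE *f;

		snprintf(path, sizeof(path),
			 "/sys/devices/system/node/node%d/cpulist", n);
		f = fopen(path, "r");
		if (!f)
			continue;	/* node not present */
		buf[0] = '\0';
		if (fgets(buf, sizeof(buf), f) && buf[0] != '\n')
			printf("node%d cpus: %s", n, buf);
		else
			printf("node%d is memory-only\n", n);
		fclose(f);
	}
	return 0;
}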

Looking at this whole conversation, all I see is two different views on how to present to userspace the asymmetric multiprocessing arrangements that have become commonplace in today's systems. Your model favors performance, while treating CAPI memory as a NUMA node favors simplicity for userspace.

