Re: Interacting with coherent memory on external devices

From: Christoph Lameter
Date: Thu Apr 23 2015 - 10:38:24 EST


On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:

> In fact I'm quite surprised, what we want to achieve is the most natural
> way from an application perspective.

Well, the most natural thing would be if the beast would just do what I
tell it in plain English. But then I would not have my job anymore.

> You have something in memory, whether you got it via malloc, mmap'ing a file,
> shmem with some other application, ... and you want to work on it with the
> co-processor that is residing in your address space. Even better, pass a pointer
> to it to some library you don't control which might itself want to use the
> coprocessor ....

Yes, that works already. What's new about this? This seems to have been
solved on the Intel platform, for example.

> What you propose can simply not provide that natural usage model with any
> efficiency.

There is no efficiency anymore if the OS can create random events in a
computational stream that is highly optimized for data exchange between
multiple threads at defined time intervals. If transparency or the natural
usage model can avoid this, then OK, but what I see proposed here is some
behind-the-scenes model that may severely degrade performance. And this
does seem to go way beyond CAPI, at least as I have understood it so far:
as a method for cache coherency at the cache line level and a way to
simplify the coordination of page tables and TLBs across multiple
divergent architectures.

I think these two things need to be separated. The shift-the-memory-back-
and-forth approach should be separate and if someone wants to use the
thing then it should also work on other platforms like ARM and Intel.

CAPI needs to be implemented as a way to potentially improve the existing
communication paths between devices and the main processor. For example,
the existing InfiniBand MMU synchronization issues and RDMA registration
problems could be addressed with this. The existing mechanisms for GPU
communication could become much cleaner and easier to handle. This is all
good but independent of any "transparent" memory implementation.

> It might not be *your* model based on *your* application but that doesn't mean
> it's not there, and isn't relevant.

Sadly this is the way that an entire industry does its thing.