Re: Interacting with coherent memory on external devices

From: Paul E. McKenney
Date: Sat Apr 25 2015 - 07:46:57 EST

On Fri, Apr 24, 2015 at 03:00:18PM -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Jerome Glisse wrote:
> > > Still no answer as to why that is not possible with the current scheme?
> > > You keep on talking about pointers and I keep on responding that this is a
> > > matter of making the address space compatible on both sides.
> >
> > So if we do that in a naive way, how can we migrate a chunk of memory to video
> > memory while still properly handling the case where the CPU tries to access that
> > same memory while it is migrated to the GPU memory?
> Well, that is the same issue that the migration code is handling which I
> submitted a long time ago to the kernel.

Would you have a URL or other pointer to this code?

> > Without modifying a single line of mm code, the only way to do this is to
> > either unmap the range being migrated from the CPU page table or to mprotect
> > it in some way. In both cases the CPU access will trigger some kind of fault.
> Yes that is how Linux migration works. If you can fix that then how about
> improving page migration in Linux between NUMA nodes first?

In principle, that also would be a good thing. But why do that first?

> > This is not the behavior we want. What we want is the same address space while
> > being able to migrate system memory to device memory (who makes that decision
> > should not be part of this discussion) while still gracefully handling any
> > CPU access.
> Well then there could be a situation where you have concurrent write
> access. How do you reconcile that then? Somehow you need to stall one or
> the other until the transaction is complete.

Or have store buffers on one or both sides.

> > This means that if the CPU accesses it, we want to migrate memory back to system
> > memory. To achieve this there is no way around adding a couple of ifs inside the
> > mm page fault code path. Now do you want each driver to add its own if branch
> > or do you want a common infrastructure to do just that?
> If you can improve the page migration in general then we certainly would
> love that. Having faultless migration is certainly a good thing for a lot of
> functionality that depends on page migration.

We do have to start somewhere, though. If we insist on perfection for
all situations before we agree to make a change, we won't be making very
many changes, now will we?

> > As I keep saying, the solution you propose is what we have today. Today we
> > have a fake shared address space through the trick of remapping system memory
> > at the same address inside the GPU address space and also enforcing the use of
> > a special memory allocator that goes behind the back of mm code.
> Hmmm... I'd like to know more details about that.

As I understand it, the trick (if you can call it that) is having the
device have the same memory-mapping capabilities as the CPUs.

> > As you pointed out, not using GPU memory is a waste and we want to be able
> > to use it. Now Paul has more sophisticated hardware that offers opportunities
> > to do things in a more transparent and efficient way.
> Does this also work between NUMA nodes in a Power8 system?

Heh! At the rate we are going with this discussion, Power8 will be
obsolete before we have this in. ;-)

Thanx, Paul
