Re: Interacting with coherent memory on external devices

From: Christoph Lameter
Date: Fri Apr 24 2015 - 16:00:26 EST


On Fri, 24 Apr 2015, Jerome Glisse wrote:

> > Still no answer as to why that is not possible with the current scheme?
> > You keep on talking about pointers and I keep on responding that this is a
> > matter of making the address space compatible on both sides.
>
> So if we do that in a naive way, how can we migrate a chunk of memory to
> video memory while still properly handling the case where the CPU tries to
> access that same memory while it is migrated to GPU memory?

Well, that is the same issue that the migration code, which I submitted to
the kernel a long time ago, already handles.

> Without modifying a single line of mm code, the only way to do this is
> either to unmap the range being migrated from the CPU page table or to
> mprotect it in some way. In both cases a CPU access will trigger some kind
> of fault.

Yes, that is how Linux migration works. If you can fix that, then how about
improving page migration between NUMA nodes in Linux first?
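
For reference, staying entirely outside the mm code that scheme ends up
looking roughly like the sketch below from user space: revoke CPU access for
the duration of the migration and catch the resulting fault. The
gpu_migrate_to_vram()/gpu_migrate_to_sysmem() calls are made-up driver entry
points, and the signal handler is simplified (nothing here is
async-signal-safe as written).

#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>

/* Made-up driver entry points, standing in for whatever the driver offers. */
extern void gpu_migrate_to_vram(void *addr, size_t len);
extern void gpu_migrate_to_sysmem(void *addr, size_t len);

static void *range;
static size_t range_len;

/* CPU touched the range while it lives in VRAM: pull it back, then
 * re-enable access so the faulting instruction can be restarted. */
static void segv_handler(int sig, siginfo_t *si, void *uc)
{
	gpu_migrate_to_sysmem(range, range_len);
	mprotect(range, range_len, PROT_READ | PROT_WRITE);
}

/* Revoke CPU access for the duration of the migration so that any
 * CPU access faults instead of racing with the copy to VRAM. */
static void migrate_out(void *addr, size_t len)
{
	struct sigaction sa = { .sa_sigaction = segv_handler,
				.sa_flags = SA_SIGINFO };

	sigaction(SIGSEGV, &sa, NULL);
	range = addr;
	range_len = len;
	mprotect(addr, len, PROT_NONE);
	gpu_migrate_to_vram(addr, len);
}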

> This is not the behavior we want. What we want is the same address space
> while being able to migrate system memory to device memory (who makes that
> decision should not be part of this discussion) while still gracefully
> handling any CPU access.

Well, then there could be a situation where you have concurrent write
access. How do you reconcile that? Somehow you need to stall one side or the
other until the transaction is complete.

> This means that if the CPU accesses it, we want to migrate the memory back
> to system memory. To achieve this there is no way around adding a couple of
> ifs inside the mm page fault code path. Now, do you want each driver to add
> its own if branch, or do you want a common infrastructure to do just that?

If you can improve page migration in general then we would certainly love
that. Faultless migration would be a good thing for a lot of functionality
that depends on page migration.
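
For what it's worth, my reading of the "couple of ifs" being asked for is
something like the following check in the CPU fault path. The dev_mem_*
names are invented placeholders for the common infrastructure under
discussion; this is a sketch, not existing kernel code.

#include <linux/mm.h>

/* Hypothetical common infrastructure. */
struct dev_mem_range;
extern struct dev_mem_range *dev_mem_lookup(struct vm_area_struct *vma,
					    unsigned long address);
extern int dev_mem_migrate_back(struct dev_mem_range *range,
				unsigned long address);

/* Called from the fault path: if the faulting address is currently
 * resident in device memory, migrate it back to system memory so the
 * fault can be retried against ordinary pages; otherwise do nothing
 * and let the normal handling proceed. */
static int maybe_migrate_from_device(struct vm_area_struct *vma,
				     unsigned long address)
{
	struct dev_mem_range *range = dev_mem_lookup(vma, address);

	if (!range)
		return 0;	/* ordinary system memory, nothing to do */

	return dev_mem_migrate_back(range, address);
}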

> As I keep saying, the solution you propose is what we have today: a fake
> shared address space achieved through the trick of remapping system memory
> at the same address inside the GPU address space, while also enforcing the
> use of a special memory allocator that goes behind the back of the mm code.

Hmmm... I'd like to know more details about that.
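
My guess at what that trick looks like from user space is roughly the sketch
below: allocate ordinary system memory, then ask the driver to map it at the
same virtual address on the GPU side. The ioctl number, argument structure
and shared_alloc() helper are invented for illustration; every driver
presumably has its own variant.

#include <linux/ioctl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/* Hypothetical ioctl and argument structure. */
struct gpu_map_args {
	unsigned long cpu_addr;	/* system memory address */
	unsigned long size;
	unsigned long gpu_addr;	/* requested GPU virtual address (same value) */
};
#define HYPOTHETICAL_GPU_MAP_SYSMEM	_IOW('G', 1, struct gpu_map_args)

void *shared_alloc(int gpu_fd, size_t size)
{
	/* Allocate ordinary system memory... */
	void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (ptr == MAP_FAILED)
		return NULL;

	/* ...and ask the driver to map it at the same virtual address in
	 * the GPU's address space, so pointers can be exchanged freely. */
	struct gpu_map_args args = {
		.cpu_addr = (unsigned long)ptr,
		.size = size,
		.gpu_addr = (unsigned long)ptr,
	};
	if (ioctl(gpu_fd, HYPOTHETICAL_GPU_MAP_SYSMEM, &args) < 0) {
		munmap(ptr, size);
		return NULL;
	}
	return ptr;
}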

> As you pointed out, not using GPU memory is a waste and we want to be able
> to use it. Now, Paul has more sophisticated hardware that offers
> opportunities to do things in a more transparent and efficient way.

Does this also work between NUMA nodes in a Power8 system?
