Re: Interacting with coherent memory on external devices

From: Christoph Lameter
Date: Mon Apr 27 2015 - 12:31:21 EST


On Mon, 27 Apr 2015, Paul E. McKenney wrote:

> I would instead look on this as a way to try out use of hardware migration
> hints, which could lead to hardware vendors providing similar hints for
> node-to-node migrations. At that time, the benefits could be provided
> to all the functionality relying on such migrations.

Ok that sounds good. These "hints" could allow for the optimization of the
page migration logic.

> > Well yes that works with read-only mappings. Maybe we can special case
> > that in the page migration code? We do not need migration entries if
> > access is read-only actually.
>
> So you are talking about the situation only during the migration itself,
> then? If there is no migration in progress, then of course there is
> no problem with concurrent writes because the cache-coherence protocol
> takes care of things. During migration of a given page, I agree that
> marking that page read-only on both sides makes sense.

This is sort of what happens in the current migration scheme. In the page
tables the regular entries are replaced by migration ptes and the page is
therefore inaccessible. Any access is then trapped until the page
contents have been moved to the new location. Then the migration pte is
replaced by a real pte again that allows full access to the page. At that
point the processes that have been put to sleep because they attempted an
access to that page are woken up.
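
As a rough illustration only, the sequence described above can be modeled
as a small userspace state machine. This is a toy sketch, not kernel code:
the names Pte, Page, access, begin_migration and finish_migration are made
up for the example, and the "copy" is just a Python buffer copy standing in
for moving the contents to the new location.

```python
from enum import Enum, auto


class Pte(Enum):
    PRESENT = auto()    # regular entry: full access to the page
    MIGRATION = auto()  # migration entry: every access traps


class Page:
    """Toy model of one page under the current migration scheme."""

    def __init__(self, data):
        self.pte = Pte.PRESENT
        self.frame = bytearray(data)  # stands in for the current location
        self.sleepers = []            # tasks that trapped during migration

    def access(self, task):
        """Return True if the access proceeds, False if the task sleeps."""
        if self.pte is Pte.MIGRATION:
            self.sleepers.append(task)
            return False
        return True

    def begin_migration(self):
        # Replace the regular entry with a migration entry.
        self.pte = Pte.MIGRATION

    def finish_migration(self):
        """Move the contents, restore a real entry, return the woken tasks."""
        self.frame = bytearray(self.frame)  # copy to the "new location"
        self.pte = Pte.PRESENT
        woken, self.sleepers = self.sleepers, []
        return woken
```

In this model any call to access() between begin_migration() and
finish_migration() returns False, readers included, and the task is only
handed back by finish_migration().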

The current scheme may be improved on by allowing read access to the page
while migration is in progress. If we changed the migration entries to
allow read access then the readers would not have to be put to sleep. Only
writers would have to be put to sleep until the migration is complete.
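
A minimal sketch of that variant, again as a userspace toy model rather
than kernel code. The class ReadablePage and its methods are hypothetical
names chosen for the example. It assumes that a read during migration can
simply be served from the old copy, and that a woken writer retries its
write itself.

```python
class ReadablePage:
    """Toy model: migration entries that still allow read access."""

    def __init__(self, data):
        self.migrating = False
        self.frame = bytes(data)    # stands in for the page contents
        self.sleeping_writers = []  # only writers ever sleep

    def read(self, task):
        # Reads are served even while migration is in progress.
        return self.frame

    def write(self, task, data):
        """Return True if the write lands, False if the writer sleeps."""
        if self.migrating:
            self.sleeping_writers.append(task)
            return False
        self.frame = bytes(data)
        return True

    def begin_migration(self):
        self.migrating = True

    def finish_migration(self):
        """Move the contents, allow writes again, return the woken writers."""
        self.frame = bytes(self.frame)  # copy to the "new location"
        self.migrating = False
        woken, self.sleeping_writers = self.sleeping_writers, []
        return woken
```

Because no write can land between begin_migration() and
finish_migration(), the contents that readers see during the copy are the
same contents that end up at the new location.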

> And I agree that latency-sensitive applications might not tolerate
> the page being read-only, and thus would want to avoid migration.
> Such applications would of course instead rely on placing the memory.

That's why we have the ability to switch off these automatisms and that is
why we are trying to keep the OS away from certain processors.

But this is not the only concern here. The other thing is to make this fit
into existing functionality as cleanly as possible. So I think we would be
looking at gradual improvements in the page migration logic as well as
in the support for mapping external memory via driver mmap calls, DAX
and/or RDMA subsystem functionality. Those two areas of functionality need
to work together better in order to provide a solution for your use cases.

