Re: [PATCH v6 5/6] nouveau: use new mmu interval notifiers

From: Ralph Campbell
Date: Wed Feb 19 2020 - 20:10:54 EST



On 1/16/20 12:21 PM, Jason Gunthorpe wrote:
On Thu, Jan 16, 2020 at 12:16:30PM -0800, Ralph Campbell wrote:
Can you point me to the latest ODP code? Seems like my understanding is
quite off.

https://elixir.bootlin.com/linux/v5.5-rc6/source/drivers/infiniband/hw/mlx5/odp.c

Look for the word 'implicit'

mlx5_ib_invalidate_range() releases the interval_notifier when there are
no populated shadow PTEs in its leaf

pagefault_implicit_mr() creates an interval_notifier that covers the
level in the page table that needs population. Notice it just uses an
unlocked xa_load to find the page table level.

The locking is pretty tricky as it relies on RCU, but the fault flow
is fairly lightweight.

Jason

Thanks for the information, Jason.

I'm still interested in finding a way to support range based hints to device drivers.
madvise() looks like it only sets a bit in vma->vm_flags or acts on the
advice immediately. mbind() and set_mempolicy() only work with CPUs and memory
with NUMA a node number. What I'm looking for is a way for the device to know
whether to migrate pages to device private memory on a fault, whether to duplicate
read-only pages in device private memory, or remote map/access a page instead of migrating it.
For example, there is a working draft extension to OpenCL,
https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/USM/cl_intel_unified_shared_memory.asciidoc
that could provide a way to specify this sort of advice.
C++ is also looking at extentions for specifying affinity attributes.
In any case, these are probably a long ways off before being finalized and implemented.

I also have some changes to support THP migration to device private memory but that
will require updating nouveau to use 2MB TLB mappings.

In the mean time, I can update the HMM self tests to do something like ODP without
changing mm/mmu_notifier.c but I don't think I can easily change nouveau to that model.