Re: [RFC PATCH] mm/hmm, mm/migrate_device: Allow p2p access and p2p migration
From: Thomas Hellström
Date: Tue Oct 15 2024 - 08:41:45 EST
Hi, Jason.
Thanks for the feedback.
On Tue, 2024-10-15 at 09:17 -0300, Jason Gunthorpe wrote:
> On Tue, Oct 15, 2024 at 01:13:22PM +0200, Thomas Hellström wrote:
> > Introduce a way for hmm_range_fault() and migrate_vma_setup() to
> > identify foreign devices with fast interconnect and thereby allow
> > both direct access over the interconnect and p2p migration.
> >
> > The need for a callback arises because without it, the p2p ability
> > would need to be static and determined at dev_pagemap creation time.
> > With a callback it can be determined dynamically, and in the migrate
> > case the callback could separate out local device pages.
>
>
> > +static bool hmm_allow_devmem(struct hmm_range *range, struct page *page)
> > +{
> > +	if (likely(page->pgmap->owner == range->dev_private_owner))
> > +		return true;
> > +	if (likely(!range->p2p))
> > +		return false;
> > +	return range->p2p->ops->p2p_allow(range->p2p, page);
> > +}
> > +
> >  static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
> >  			      unsigned long end, pmd_t *pmdp, pte_t *ptep,
> >  			      unsigned long *hmm_pfn)
> > @@ -248,8 +258,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
> >  	 * just report the PFN.
> >  	 */
> >  	if (is_device_private_entry(entry) &&
> > -	    pfn_swap_entry_to_page(entry)->pgmap->owner ==
> > -	    range->dev_private_owner) {
> > +	    hmm_allow_devmem(range, pfn_swap_entry_to_page(entry))) {
> >  		cpu_flags = HMM_PFN_VALID;
> >  		if (is_writable_device_private_entry(entry))
> >  			cpu_flags |= HMM_PFN_WRITE;
>
> This is really misnamed and took me a while to get it.
>
> It has nothing to do with kernel P2P, you are just allowing more
> selective filtering of dev_private_owner. You should focus on that in
> the naming, not p2p. ie allow_dev_private()
>
> P2P is stuff that is dealing with MEMORY_DEVICE_PCI_P2PDMA.
Yes, although the intention was for "P2P" to also cover other fast
interconnects, not just "PCIe P2P". I'll definitely take a look at the
naming, though.
>
> This is just allowing more instances of the same driver to
> co-ordinate their device private memory handle, for whatever purpose.
Exactly, or theoretically even cross-driver.
>
> Otherwise I don't see a particular problem, though we have talked
> about widening the matching for device_private more broadly using
> some kind of grouping tag or something like that instead of a
> callback. You may consider that as an alternative
Yes. Looked at that, but (if I understand you correctly) that would be
the case mentioned in the commit message where the group would be set
up statically at dev_pagemap creation time?
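
To make the distinction concrete, here is a minimal userspace sketch, not
kernel code. It assumes a hypothetical `group` tag fixed when the pagemap is
created, and compares it with a check that consults runtime state. The names
fake_pagemap, fake_link, allow_static() and allow_dynamic() are made up for
illustration and exist neither in the kernel nor in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pagemap stand-in: 'group' is fixed at creation time. */
struct fake_pagemap {
	const void *owner;
	int group;
};

/* Static variant: accept our own pages, or pages whose creation-time
 * group tag matches the group the caller asks for. */
static bool allow_static(const struct fake_pagemap *pgmap,
			 const void *owner, int group)
{
	return pgmap->owner == owner || pgmap->group == group;
}

/* Hypothetical runtime state, e.g. whether an interconnect link is up. */
struct fake_link {
	bool up;
};

/* Dynamic variant: accept our own pages, otherwise decide from state
 * that may change after the pagemap was created. */
static bool allow_dynamic(const struct fake_pagemap *pgmap,
			  const void *owner, const struct fake_link *link)
{
	if (pgmap->owner == owner)
		return true;
	return link != NULL && link->up;
}
```

With the static variant the answer for a given pagemap never changes, whereas
the dynamic variant can start or stop accepting foreign pages as the link
state changes.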
>
> I would also probably try to have less indirection, you can embed
> the hmm_range struct inside a caller private data struct and use that
> instead of inventing a whole new struct and pointer.
Our first attempt was based on that but then that wouldn't be reusable
in the migrate_device.c code. Hence the extra indirection.
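
Roughly, as a userspace sketch only (this is not the patch code, and every
struct and function name below is made up for illustration):
hmm_range_fault() takes a struct hmm_range while migrate_vma_setup() takes a
struct migrate_vma, so a driver struct embedding one of them can't be
recovered from the other. A separate filter object reached through a pointer
can be shared by both paths, and the driver still uses container_of() on the
filter itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical shared filter, usable from both code paths. */
struct devmem_filter {
	bool (*allow)(struct devmem_filter *filter, const void *page_owner);
};

/* Hypothetical, simplified stand-ins for the two core structs. */
struct fake_range {
	const void *owner;
	struct devmem_filter *filter;
};

struct fake_migrate {
	const void *owner;
	struct devmem_filter *filter;
};

/* Driver-private data wraps the filter, not either core struct. */
struct drv_filter {
	struct devmem_filter base;
	const void *peer_owner;	/* owner tag of a reachable peer device */
};

static bool drv_allow(struct devmem_filter *filter, const void *page_owner)
{
	struct drv_filter *df = container_of(filter, struct drv_filter, base);

	return page_owner == df->peer_owner;
}

/* Common check: local pages always pass, others go through the filter. */
static bool allow_devmem(const void *local_owner,
			 struct devmem_filter *filter, const void *page_owner)
{
	if (page_owner == local_owner)
		return true;
	if (!filter)
		return false;
	return filter->allow(filter, page_owner);
}

static bool range_allows(struct fake_range *r, const void *page_owner)
{
	return allow_devmem(r->owner, r->filter, page_owner);
}

static bool migrate_allows(struct fake_migrate *m, const void *page_owner)
{
	return allow_devmem(m->owner, m->filter, page_owner);
}
```

The same drv_filter instance can be pointed to from both a fake_range and a
fake_migrate, which is the reuse the extra pointer buys.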
Thanks,
Thomas
>
> Jason