RE: [PATCH v3 2/9] vfio/pci: Add a helper to look up PFNs for DMABUFs
From: Tian, Kevin
Date: Tue Jun 16 2026 - 05:31:39 EST
> From: Matt Evans <matt@xxxxxxxxxx>
> Sent: Tuesday, June 16, 2026 2:04 AM
>
> On 12/06/2026 09:42, Tian, Kevin wrote:
> >> From: Matt Evans <matt@xxxxxxxxxx>
> >> Sent: Wednesday, June 10, 2026 11:43 PM
> >>
> >> +int vfio_pci_dma_buf_find_pfn(struct vfio_pci_dma_buf *priv,
> >> + struct vm_area_struct *vma,
> >> + unsigned long address,
> >> + unsigned int order,
> >> + unsigned long *out_pfn)
> >> +{
> >> + /*
> >> + * Given a VMA (start, end, pgoffs) and a fault address,
> >> + * search the corresponding DMABUF's phys_vec[] to find the
> >> + * range representing the address's offset into the VMA, and
> >> + * its PFN.
> >> + *
> >> + * The phys_vec[] ranges represent contiguous spans of VAs
> >> + * upwards from the buffer offset 0; the actual PFNs might be
> >> + * in any order, overlap/alias, etc. Calculate an offset of
> >> + * the desired page given VMA start/pgoff and address, then
> >> + * search upwards from 0 to find which span contains it.
> >> + *
> >> + * On success, a valid PFN for a page sized by 'order' is
> >> + * returned into out_pfn.
> >> + *
> >> + * Failure occurs if:
> >> + * - The page would cross the edge of the VMA
> >> + * - The page isn't entirely contained within a range
> >> + * - We find a range, but the final PFN isn't aligned to the
> >> + * requested order.
> >> + *
> >> + * (Upon failure, the caller is expected to try again with a
> >> + * smaller order; the tests above will always succeed for
> >> + * order=0 as the limit case.)
> >> + *
> >> + * It's suboptimal if DMABUFs are created with neigbouring
> >
> > s/neigbouring/neighboring/
>
> Ah, not a typo. :) That is en_GB and AFAIK is permitted.
I guess you meant 'neighbouring' and 'neighboring' are both valid.
but here lacking a 'h' should be a typo? :)
> >> + */
> >> +
> >> + const unsigned long pagesize = PAGE_SIZE << order;
> >> + unsigned long vma_off = ((vma->vm_pgoff - priv->vma_pgoff_adjust)
> >> <<
> >> + PAGE_SHIFT) & VFIO_PCI_OFFSET_MASK;
> >> + unsigned long rounded_page_addr = ALIGN_DOWN(address,
> >> pagesize);
> >> + unsigned long rounded_page_end = rounded_page_addr + pagesize;
> >> + unsigned long page_buf_offset;
> >> + unsigned long page_buf_offset_end;
> >
> > what about "fault_offset[_end]"? page_buf is a bit confusing.
>
> I went round several times with these names, thanks for the input. Just
> tried it out and your suggestion is clearer.
>
> >> + unsigned long range_buf_offset = 0;
> >
> > could this be called 'range_start' then the 'range_start' in latter loop
> > is renamed to 'phys_start'?
> >
> > Not strong... just feel such naming helps me understand the logic easier
>
> Anything that helps helps, thanks. I ended up renaming this to
> range_start_offset (as offset is IMHO important).
>
> I'm a fan of diagrams but this is too large to include in a comment.
> But for posterity on the list, and using the new names, an illustration
> of a DMABUF with 3 ranges in phys_vec, where a mapping's
> faulting page offset lies in range [1]:
>
> fault_addr--+
> v VMA
> +-----------------+----------+-----------------+
> | | Faulting | |
> | | (hg)page | |
> | | | |
> |---- vma_off ---->+-----------------+----------+-----------------+
> | . .
> | . .
> |--------- fault_offset ------------>. . DMABUF
> +-------------------------+---------------------------+--------------+
> | phys_vec[0] | phys_vec[1] . | phys_vec[2] |
> | .paddr | . . | |
> | .len | . . | |
> +-------------------------+---------------------------+--------------+
> 0 : . . : L
> |-- range_start_offset -->: . . -->: range_len
> : . . :
> V . . :
> +----------+----------+-----+
> |.paddr | PFN | |
> | | | |
> | | | |
> +----------+----------+-----+
> P
>
> P = paddr + (fault_offset - range_start_offset)
> L = sum(phys_vec[0...2].len)
>
yes, much clearer now.