Re: [PATCH V2 4/7] dax/fsdev: clamp direct_access return to current physical range

From: John Groves

Date: Sat May 30 2026 - 09:07:59 EST


On 26/05/26 05:00PM, Dave Jiang wrote:
>
>
> On 5/22/26 12:19 PM, John Groves wrote:
> > From: John Groves <John@xxxxxxxxxx>
> >
> > __fsdev_dax_direct_access() returned the number of available pages based
> > on cached_size (the total size across all ranges). For multi-range
> > devices with physical gaps between ranges, this over-reports the number
> > of physically contiguous pages available from the returned kaddr/pfn.
> > Callers trust this return value to mean contiguous pages, so accessing
> > beyond the current range boundary would hit unmapped or unrelated memory.
> >
> > Fix by finding the range that contains the translated physical address
> > and clamping the return to the remaining pages within that range.
> >
> > Also remove the now-unused cached_size field from struct dev_dax, since
> > it was only consumed by the old return calculation.
> >
> > Fixes: 099c81a1f0ab3 ("dax: Add dax_operations for use by fs-dax on fsdev dax")
> > Signed-off-by: John Groves <john@xxxxxxxxxx>
>
> I ran this through Claude and this is what it came back with and it looks reasonable to me:
>
> The claimed bug does not manifest in the current tree. This is a
> refactor + dead-field removal dressed as a bugfix. Either the justification is
> wrong or there's a missing companion change.
>
> The claim doesn't hold up
>
> Commit message:
>
> ▎ __fsdev_dax_direct_access() returned the number of available pages based on
> ▎ cached_size (the total size across all ranges). For multi-range devices with
> ▎ physical gaps between ranges, this over-reports the number of physically
> ▎ contiguous pages available from the returned kaddr/pfn.
>
> Walk through it. __fsdev_dax_direct_access does:
>
>     phys = dax_pgoff_to_phys(dev_dax, pgoff, size);
>     if (phys == -1)
>             return -EFAULT;
>     ...
>     return PHYS_PFN(min(size, dev_dax->cached_size - offset));
>
> Now look at what dax_pgoff_to_phys (drivers/dax/bus.c:1433) enforces:
>
>     phys = PFN_PHYS(pgoff - dax_range->pgoff) + range->start;
>     if (phys + size - 1 <= range->end)
>             return phys;
>     break; /* falls through to return -1 */
>
> It returns valid phys only when the entire requested size fits inside a single
> range. If the request straddles a gap, it returns -1 and the caller bails
> with -EFAULT before ever reaching the clamp.
>
> So in the only path where the clamp runs, we already know size <= range->end -
> phys + 1, which means cached_size - offset >= size, which means min(size,
> cached_size - offset) == size. Pre-patch returns nr_pages. Post-patch returns
> nr_pages. Same answer.
>
> I worked it through with concrete numbers on a 2-range device with a physical
> gap (range[0]=4 pages at 0x1000_0000, range[1]=4 pages at 0x2000_0000):
> - pgoff=2, nr_pages=2: both return 2. ✓
> - pgoff=3, nr_pages=2 (straddles gap): both return -EFAULT via the early bail.
> - pgoff=4, nr_pages=2: both return 2. ✓
>
> I cannot construct a case where pre-patch over-reports.
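
The walk-through above can be reproduced with a small userspace model. This is
only a sketch, not the driver code: the model_* names, the 4 KiB page size and
the table layout are assumptions made for illustration. The translation check
and the pre-patch return expression follow the snippets quoted above, and the
post-patch clamp follows the commit message's description.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL     /* assumed page size */
#define MODEL_NO_PHYS   UINT64_MAX  /* stands in for the -1 return */

/* Hypothetical stand-in for one entry of dev_dax->ranges[]. */
struct model_range {
	uint64_t pgoff;  /* first device page offset in this range */
	uint64_t start;  /* first physical byte */
	uint64_t end;    /* last physical byte, inclusive */
};

/* Two 4-page ranges with a physical gap, as in the example above. */
static const struct model_range model_ranges[] = {
	{ 0, 0x10000000ULL, 0x10000000ULL + 4 * MODEL_PAGE_SIZE - 1 },
	{ 4, 0x20000000ULL, 0x20000000ULL + 4 * MODEL_PAGE_SIZE - 1 },
};
#define MODEL_NR_RANGES (sizeof(model_ranges) / sizeof(model_ranges[0]))

/* Total size across all ranges, i.e. what cached_size held. */
static const uint64_t model_cached_size = 8 * MODEL_PAGE_SIZE;

/* Succeeds only if the whole request fits inside a single range. */
static uint64_t model_pgoff_to_phys(uint64_t pgoff, uint64_t size)
{
	for (size_t i = 0; i < MODEL_NR_RANGES; i++) {
		const struct model_range *r = &model_ranges[i];
		uint64_t pages = (r->end - r->start + 1) / MODEL_PAGE_SIZE;
		uint64_t phys;

		if (pgoff < r->pgoff || pgoff >= r->pgoff + pages)
			continue;
		phys = (pgoff - r->pgoff) * MODEL_PAGE_SIZE + r->start;
		if (phys + size - 1 <= r->end)
			return phys;
		break;  /* request straddles the end of this range */
	}
	return MODEL_NO_PHYS;
}

/* Pre-patch: clamp against the total size of all ranges. */
long model_pre_patch(uint64_t pgoff, long nr_pages)
{
	uint64_t size = (uint64_t)nr_pages * MODEL_PAGE_SIZE;
	uint64_t avail = model_cached_size - pgoff * MODEL_PAGE_SIZE;

	if (model_pgoff_to_phys(pgoff, size) == MODEL_NO_PHYS)
		return -EFAULT;
	return (long)((size < avail ? size : avail) / MODEL_PAGE_SIZE);
}

/* Post-patch: clamp against what is left of the containing range. */
long model_post_patch(uint64_t pgoff, long nr_pages)
{
	uint64_t size = (uint64_t)nr_pages * MODEL_PAGE_SIZE;
	uint64_t phys = model_pgoff_to_phys(pgoff, size);

	if (phys == MODEL_NO_PHYS)
		return -EFAULT;
	for (size_t i = 0; i < MODEL_NR_RANGES; i++) {
		const struct model_range *r = &model_ranges[i];

		if (phys >= r->start && phys <= r->end) {
			uint64_t avail = r->end - phys + 1;

			return (long)((size < avail ? size : avail) /
				      MODEL_PAGE_SIZE);
		}
	}
	return -EFAULT;  /* not reached when the translation succeeded */
}

/* Asserts that both variants agree for every request on this layout. */
void model_check_all(void)
{
	for (uint64_t pgoff = 0; pgoff < 8; pgoff++)
		for (long nr = 1; nr <= 8; nr++)
			assert(model_pre_patch(pgoff, nr) ==
			       model_post_patch(pgoff, nr));
}
```

Calling model_check_all() from a test main() walks every pgoff/nr_pages pair on
this two-range layout. It says nothing about other layouts or about callers
that bypass the translation check.
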
>
> What's actually changed
>
> 1. A no-op clamp loop replaces a constant calculation. Same return value in
> every reachable case.
> 2. The trailing return -EFAULT; is dead code. If dax_pgoff_to_phys returned
> valid phys, then by construction phys lies inside some range in
> dev_dax->ranges[], so the loop will match.
> 3. O(N) walk replaces O(1) arithmetic. Trivial for small nr_range, but it's a
> regression in the wrong direction for a hot path.
> 4. cached_size field is removed. Good cleanup — I grepped, no other consumers.
>
>
> DJ

Thanks Dave. Sashiko reviews can be great, but sometimes they make us (me)
dumber.

Dropping this patch from the series.

John

<snip>