Re: [PATCH RFC] rxe: Fix iova-to-va conversion for MR page sizes != PAGE_SIZE

From: Zhijian Li (Fujitsu)

Date: Tue Jan 06 2026 - 22:02:30 EST




On 06/01/2026 09:07, Jason Gunthorpe wrote:
> On Mon, Jan 05, 2026 at 06:55:22AM +0000, Zhijian Li (Fujitsu) wrote:
>
>> After digging into the behavior during the srp/012 test again, it
>> turns out this fix is incomplete. The current xarray page_list
>> approach cannot correctly map memory regions composed of two or more
>> scatter-gather segments.
> I seem to recall there are DMA API functions that can control what
> kinds of scatterlists the block stack will push down.
>
> For real HW we already cannot support less than 4K alignment of interior SGL
> segments.
>
> Maybe rxe can tell the block stack it can only support PAGE_SIZE
> alignment of interior SGL segments?
>
> If not then this would be the reason rxe needs mr->page_size, to
> support 4k.
>

I agree that we should support smaller page sizes such as 4K.
Some ULPs indeed hardcode assumptions about a 4K page size.


> And obviously if the mr->page size is less than PAGE_SIZE the xarray
> datastructure does not work. You'd have to store physical addresses
> instead..



You're absolutely right that the current xarray of struct page pointers
is fundamentally flawed for this use case, both for mr->page_size < PAGE_SIZE
and for interior SGL segments that are not PAGE_SIZE aligned.

Storing the DMA addresses directly, as you suggested, seems like a much
more robust path forward. I will explore this approach.
It would essentially amount to reverting commit 592627ccbdff ("RDMA/rxe: Replace rxe_map and rxe_phys_buf by xarray").

Thanks
Zhijian

>
> Jason