Re: [PATCH RFC 1/2] RDMA/rxe: don't allow registering !PAGE_SIZE mr

From: Jason Gunthorpe
Date: Tue Oct 31 2023 - 09:19:39 EST


On Tue, Oct 31, 2023 at 04:52:23PM +0800, Zhu Yanjun wrote:
> 在 2023/10/30 20:40, Jason Gunthorpe 写道:
> > On Mon, Oct 30, 2023 at 07:51:41AM +0000, Zhijian Li (Fujitsu) wrote:
> > >
> > >
> > > On 27/10/2023 13:41, Li Zhijian wrote:
> > > > mr->page_list only encodes *page without page offset, when
> > > > page_size != PAGE_SIZE, we cannot restore the address with a wrong
> > > > page_offset.
> > > >
> > > > Note that this patch will break some ULPs that try to register 4K
> > > > MR when PAGE_SIZE is not 4K.
> > > > SRP and nvme over RXE is known to be impacted.
> > > >
> > > > Signed-off-by: Li Zhijian <lizhijian@xxxxxxxxxxx>
> > > > ---
> > > > drivers/infiniband/sw/rxe/rxe_mr.c | 6 ++++++
> > > > 1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> > > > index f54042e9aeb2..61a136ea1d91 100644
> > > > --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> > > > +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> > > > @@ -234,6 +234,12 @@ int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
> > > > struct rxe_mr *mr = to_rmr(ibmr);
> > > > unsigned int page_size = mr_page_size(mr);
> > > > + if (page_size != PAGE_SIZE) {
> > >
> > > It seems this condition is too strict, it should be:
> > > if (!IS_ALIGNED(page_size, PAGE_SIZE))
> > >
> > > So that, page_size with (N * PAGE_SIZE) can work as previously.
> > > Because the offset(mr.iova & page_mask) will get lost only when !IS_ALIGNED(page_size, PAGE_SIZE)
> >
> > That makes sense
>
> I read all the discussions very carefully.
>
> Thanks, Greg.
>
> Because RXE only supports PAGE_SIZE, when CONFIG_ARM64_64K_PAGES is enabled,
> the PAGE_SIZE is 64K, when CONFIG_ARM64_64K_PAGES is disabled, PAGE_SIZE is
> 4K.
>
> But NVMe calls ib_map_mr_sg with a fixed size SZ_4K. When
> CONFIG_ARM64_64K_PAGES is enabled, it is still 4K. This is not a problem in
> RXE. This problem is in NVMe.

Maybe, but no real RDMA devices don't support 4K.

The xarray conversion may need revision to use physical addresses
instead of storing struct pages so it can handle this kind of
segmentation.

Certainly in the mean time it should be rejected.

Jason