Re: [Patch v9 12/12] RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter

From: Jason Gunthorpe
Date: Fri Oct 28 2022 - 13:18:28 EST


On Fri, Oct 21, 2022 at 05:01:29PM -0700, longli@xxxxxxxxxxxxxxxxx wrote:
> +int mana_ib_gd_create_dma_region(struct mana_ib_dev *dev, struct ib_umem *umem,
> + mana_handle_t *gdma_region)
> +{
> + struct gdma_dma_region_add_pages_req *add_req = NULL;
> + struct gdma_create_dma_region_resp create_resp = {};
> + struct gdma_create_dma_region_req *create_req;
> + size_t num_pages_cur, num_pages_to_handle;
> + unsigned int create_req_msg_size;
> + struct hw_channel_context *hwc;
> + struct ib_block_iter biter;
> + size_t max_pgs_create_cmd;
> + struct gdma_context *gc;
> + size_t num_pages_total;
> + struct gdma_dev *mdev;
> + unsigned long page_sz;
> + void *request_buf;
> + unsigned int i;
> + int err;
> +
> + mdev = dev->gdma_dev;
> + gc = mdev->gdma_context;
> + hwc = gc->hwc.driver_data;
> +
> + /* Hardware requires dma region to align to chosen page size */
> + page_sz = ib_umem_find_best_pgsz(umem, PAGE_SZ_BM, 0);

Does your HW support arbitrary MR offsets in the IOVA?

struct ib_mr *mana_ib_reg_user_mr(struct ib_pd *ibpd, u64 start, u64 length,
                                  u64 iova, int access_flags,
                                  struct ib_udata *udata)
{
[..]

        err = mana_ib_gd_create_dma_region(dev, mr->umem, &dma_region_handle);
..
        mr_params.gva.virtual_address = iova;

E.g., if I set iova to 1 and length to PAGE_SIZE and pass in a umem that
is fully page aligned, will the HW work, or will it DMA to the wrong
locations?

All other RDMA HW requires passing iova to
ib_umem_find_best_pgsz() specifically to reject/adjust the
misalignment of the IOVA relative to the selected page size.
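
To make the alignment concern concrete, here is a rough userspace sketch of
the rule, not the kernel implementation. It assumes a single contiguous DMA
region and ignores the MR length. The helper name best_pgsz_sketch() and the
SKETCH_SZ_* constants are hypothetical. A page size is usable only if the
IOVA and the DMA address share the same offset within a page of that size.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SZ_4K 0x1000ULL    /* hypothetical constants for this sketch */
#define SKETCH_SZ_2M 0x200000ULL

/*
 * Hypothetical helper, NOT ib_umem_find_best_pgsz(): return the largest
 * page size in pgsz_bitmap for which iova and dma_addr have the same
 * offset within the page, or 0 if no offered page size qualifies.
 * Assumes one contiguous DMA region; the MR length is not considered.
 */
static uint64_t best_pgsz_sketch(uint64_t pgsz_bitmap, uint64_t dma_addr,
                                 uint64_t iova)
{
    /* Bits that differ between the two addresses */
    uint64_t diff = dma_addr ^ iova;
    uint64_t best = 0;

    /* The caller must offer at least one page size */
    assert(pgsz_bitmap != 0);

    for (int bit = 0; bit < 64; bit++) {
        uint64_t pgsz = 1ULL << bit;

        if (!(pgsz_bitmap & pgsz))
            continue;
        /* Usable only if no differing bit lies below the page size */
        if ((diff & (pgsz - 1)) == 0)
            best = pgsz;
    }
    return best;
}
```

With a bitmap of 4K and 2M and a DMA address of 0x200000, an iova of
0x400000 selects 2M, an iova of 0x401000 selects 4K, and an iova of 1
returns 0, i.e. the registration would have to be rejected.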

> + __rdma_umem_block_iter_start(&biter, umem, page_sz);
> +
> + for (i = 0; i < num_pages_to_handle; ++i) {
> + dma_addr_t cur_addr;
> +
> + __rdma_block_iter_next(&biter);
> + cur_addr = rdma_block_iter_dma_address(&biter);
> +
> + create_req->page_addr_list[i] = cur_addr;
> + }

This loop is still a mess. Why can you not write it as I said for v6?

Usually the way these loops are structured is to fill the array and
then check for fullness, trigger an action to drain the array, and
reset the indexes back to the start.

So do the usual:

    rdma_umem_for_each_dma_block() {
        page_addr_list[tail++] = rdma_block_iter_dma_address(&biter);
        if (tail >= num_pages_to_handle) {
            mana_gd_send_request()
            reset buffer
            tail = 0
        }
    }

    if (tail)
        mana_gd_send_request()
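
The same fill/drain/reset shape can be tried outside the kernel. The
following is a minimal standalone sketch, assuming the DMA block addresses
are already in a plain array and that sending a request is modelled by a
callback. The names send_in_batches(), flush_fn, flush_stats and
count_flush() are hypothetical and are not part of the mana or RDMA core
APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for sending one request carrying n addresses */
typedef int (*flush_fn)(const uint64_t *addrs, size_t n, void *ctx);

/* Example callback state: records how the batches were delivered */
struct flush_stats {
    size_t calls;   /* number of requests sent */
    size_t total;   /* total addresses sent across all requests */
    size_t last_n;  /* size of the most recent request */
};

static int count_flush(const uint64_t *addrs, size_t n, void *ctx)
{
    struct flush_stats *st = ctx;

    (void)addrs;    /* a real sender would copy these into a message */
    st->calls++;
    st->total += n;
    st->last_n = n;
    return 0;
}

/*
 * Fill batch[] from blocks[]; whenever it is full, drain it through
 * flush() and reset tail to 0. A final partial batch is drained after
 * the loop. Returns 0 on success or the first non-zero flush() result.
 */
static int send_in_batches(const uint64_t *blocks, size_t nblocks,
                           uint64_t *batch, size_t cap,
                           flush_fn flush, void *ctx)
{
    size_t tail = 0;
    int err;

    assert(cap > 0);

    for (size_t i = 0; i < nblocks; i++) {
        batch[tail++] = blocks[i];
        if (tail == cap) {
            err = flush(batch, tail, ctx);
            if (err)
                return err;
            tail = 0;
        }
    }

    if (tail)
        return flush(batch, tail, ctx);
    return 0;
}
```

With five block addresses and a batch capacity of two, the callback is
invoked three times, and the last invocation carries a single address.
Because there is one index and one drain path, no separate bookkeeping of
"pages handled so far" is needed.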

Jason