Re: [PATCH] RDMA/rxe: Use kzalloc() to alloc map_set

From: lizhijian@xxxxxxxxxxx
Date: Tue May 24 2022 - 00:00:36 EST


Hi Jason & Bob
CC Guoqing

@Guoqing, It may correlate with your previous bug report: https://lore.kernel.org/all/20220210073655.42281-1-guoqing.jiang@xxxxxxxxx/T/


It's observed that a same MR in rnbd server will trigger below code
path:
-> rxe_mr_init_fast()
|-> alloc map_set() # map_set is uninitialized
|...-> rxe_map_mr_sg() # build the map_set
|-> rxe_mr_set_page()
|...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE that means
# we can access host memory(such rxe_mr_copy)
|...-> rxe_invalidate_mr() # mr->state change to FREE from VALID
|...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE,
# but map_set was not built again
|...-> rxe_mr_copy() # kernel crash due to access wild addresses
# that lookup from the map_set

I draft a patch like below for it, but i wonder if it's rxe's responsibility to do such checking.
Any comments are very welcome.


From e9d0bd821f07f5e049027f07b3ce9dc283624201 Mon Sep 17 00:00:00 2001
From: Li Zhijian <lizhijian@xxxxxxxxxxx>
Date: Tue, 24 May 2022 10:56:19 +0800
Subject: [PATCH] RDMA/rxe: check map_set valid when handle IB_WR_REG_MR

It's observed that a same MR in rnbd server will trigger below code
path:
-> rxe_mr_init_fast()
|-> alloc map_set() # map_set is uninitialized
|...-> rxe_map_mr_sg() # build the map_set
|-> rxe_mr_set_page()
|...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE that means
# we can access host memory(such rxe_mr_copy)
|...-> rxe_invalidate_mr() # mr->state change to FREE from VALID
|...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE,
# but map_set was not built again
|...-> rxe_mr_copy() # kernel crash due to access wild addresses
# that lookup from the map_set

Signed-off-by: Li Zhijian <lizhijian@xxxxxxxxxxx>
---
drivers/infiniband/sw/rxe/rxe_mr.c | 9 +++++++++
drivers/infiniband/sw/rxe/rxe_verbs.c | 1 +
drivers/infiniband/sw/rxe/rxe_verbs.h | 1 +
3 files changed, 11 insertions(+)

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 787c7dadc14f..09673d559c06 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -90,6 +90,7 @@ static int rxe_mr_alloc_map_set(int num_map, struct rxe_map_set **setp)
if (!set->map)
goto err_free_set;

+ set->valid = false;
for (i = 0; i < num_map; i++) {
set->map[i] = kmalloc(sizeof(struct rxe_map), GFP_KERNEL);
if (!set->map[i])
@@ -216,6 +217,7 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
}

set = mr->cur_map_set;
+ set->valid = true;
set->page_shift = PAGE_SHIFT;
set->page_mask = PAGE_SIZE - 1;

@@ -643,6 +645,7 @@ int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey)
}

mr->state = RXE_MR_STATE_FREE;
+ mr->cur_map_set->valid = mr->next_map_set->valid = false;
ret = 0;

err_drop_ref:
@@ -679,12 +682,18 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
return -EINVAL;
}

+ if (!mr->next_map_set->valid) {
+ pr_warn("%s: map set is not valid\n", __func__);
+ return -EINVAL;
+ }
+
mr->access = access;
mr->lkey = (mr->lkey & ~0xff) | key;
mr->rkey = (access & IB_ACCESS_REMOTE) ? mr->lkey : 0;
mr->state = RXE_MR_STATE_VALID;

set = mr->cur_map_set;
+ set->valid = false;
mr->cur_map_set = mr->next_map_set;
mr->cur_map_set->iova = wqe->wr.wr.reg.mr->iova;
mr->next_map_set = set;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 58e4412b1d16..4b7ae2d1d921 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -992,6 +992,7 @@ static int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
set->page_shift = ilog2(ibmr->page_size);
set->page_mask = ibmr->page_size - 1;
set->offset = set->iova & set->page_mask;
+ set->valid = true;

return n;
}
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index 86068d70cd95..2edf31aab7e1 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -289,6 +289,7 @@ struct rxe_map {

struct rxe_map_set {
struct rxe_map **map;
+ bool valid;
u64 va;
u64 iova;
size_t length;
--
2.31.1


On 23/05/2022 22:02, Li, Zhijian wrote:
>
> on 2022/5/20 22:45, Jason Gunthorpe wrote:
>> On Wed, May 18, 2022 at 12:37:25PM +0800, Li Zhijian wrote:
>>> Below call chains will alloc map_set without fully initializing map_set.
>>> rxe_mr_init_fast()
>>>   -> rxe_mr_alloc()
>>>      -> rxe_mr_alloc_map_set()
>>>
>>> Uninitialized values inside struct rxe_map_set are possible to cause
>>> kernel panic.
>> If the value is uninitialized then why is 0 an OK value?
>>
>> Would be happier to know the exact value that is not initialized
>
> Well, good question. After re-think of this issue, it seems this patch wasn't the root cause though it made the crash disappear in some extent.
>
> I'm still working on the root cause :)
>
> Thanks
>
> Zhijian
>
>
>>
>> Jason
>
>