Re: BUG: NULL pointer dereference at ib_uverbs_comp_handler+0x20

From: Logan Gunthorpe
Date: Tue Aug 01 2017 - 14:33:16 EST


Hey,

The patch Bharat provided fixes the kernel panic but RDMA in userspace
still does not work at all. Reverting the commits I mentioned still
fixes everything.

To answer your questions:

* I see the iwpm_register_pid message even when things are working so I
don't think it's related.

* All clients I've tried fail. I've attached a log of all the error
messages I see with various clients. (This was with Bharat's patch, so
there was no kernel panic and I saw nothing in dmesg during these
runs.) The same runs with the commits I mentioned reverted work fine.

* I retested everything with the CX4 cards as well and they have a
similar problem but produce different error messages. I've attached a
log of those client runs as well. The CX4 also works once I revert those
patches. However, from memory, I don't think the CX4s ever suffered from
the kernel panic, and I guess it was just luck that the patches I
reverted caused all these problems.


On 01/08/17 05:08 AM, Matan Barak wrote:
> PS, e0fcc61113c isn't a bug fix, it's just a simple refactor.

If it's not a bug fix, I don't think it should have a Fixes tag. It
probably didn't matter in this case, but you don't want refactor commits
to accidentally reach a stable kernel.

Thanks,

Logan






gunthorp@cgy1-donard:~$ ib_write_bw -R

************************************
* Waiting for client to connect... *
************************************
Couldn't create rdma QP - Invalid argument
Unable to create QP.
Failed to create QP.
Unable to create the resources needed by comm struct
Unable to perform rdma_server function
Unable to init the socket connection
gunthorp@cgy1-donard:~$ ib_write_bw -R flash-cxgb
Couldn't create rdma QP - Invalid argument
Unable to create QP.
Failed to create QP.
Unable to create the resources needed by comm struct
Unable to perform rdma_client function
Unable to init the socket connection
gunthorp@cgy1-donard:~$ rping -s
rdma_create_qp: Invalid argument
setup_qp failed: -1
gunthorp@cgy1-donard:~$ rping -c -a flash-cxgb -v -C5
rdma_create_qp: Invalid argument
setup_qp failed: -1
gunthorp@cgy1-donard:~$ ucmatose
cmatose: starting server
cmatose: unable to create QP: Invalid argument
cmatose: failing connection request
test complete
return status -1
gunthorp@cgy1-donard:~$ ucmatose -s flash-cxgb
cmatose: starting client
cmatose: connecting
cmatose: unable to create QP: Invalid argument
test complete
return status -1
gunthorp@cgy1-donard:~$ ibv_rc_pingpong
local address: LID 0x0000, QPN 0x000430, PSN 0xef9f8e, GID ::
Failed to modify QP to RTR
Couldn't connect to remote QP
gunthorp@cgy1-donard:~$ ibv_rc_pingpong flash-cxgb
local address: LID 0x0000, QPN 0x000438, PSN 0x9e00c8, GID ::
client read: Success
Couldn't read remote address
gunthorp@cgy1-donard:~$ ib_write_bw -R -d mlx5_0 flash-rdma
Unexpected CM event bl blka 6
Unable to perform rdma_client function
Unable to init the socket connection
gunthorp@cgy1-donard:~$ ib_write_bw -R -d mlx5_0

************************************
* Waiting for client to connect... *
************************************
Function rdma_accept failed
Unable to perform rdma_server function
Unable to init the socket connection
gunthorp@cgy1-donard:~$ rping -s
rdma_accept: Invalid argument
connect error -1
gunthorp@cgy1-donard:~$ rping -c -a flash-rdma
cma event RDMA_CM_EVENT_CONNECT_ERROR, error -1
wait for CONNECTED state 4
connect error -1
gunthorp@cgy1-donard:~$ ucmatose
cmatose: starting server
cmatose: failure accepting: Invalid argument
cmatose: failing connection request
test complete
return status -1
gunthorp@cgy1-donard:~$ ucmatose -s flash-rdma
cmatose: starting client
cmatose: connecting
cmatose: event: RDMA_CM_EVENT_CONNECT_ERROR, error: -1
test complete
return status -1
gunthorp@cgy1-donard:~$ ibv_rc_pingpong
local address: LID 0x0003, QPN 0x000847, PSN 0x2a678f, GID ::
Failed to modify QP to RTR
Couldn't connect to remote QP
gunthorp@cgy1-donard:~$ ibv_rc_pingpong flash-rdma
local address: LID 0x0003, QPN 0x000848, PSN 0xe014bd, GID ::
remote address: LID 0x0002, QPN 0x000849, PSN 0x2bd346, GID ::
Failed to modify QP to RTR