Re: [PATCH rdma-next 0/8] Cleanup and fix the CMA state machine
From: Jason Gunthorpe
Date: Thu Sep 17 2020 - 08:53:35 EST
On Wed, Sep 02, 2020 at 11:11:14AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@xxxxxxxxxx>
> From Jason:
> The RDMA CMA continues to attract syzkaller bugs due to its somewhat loose
> operation of its FSM. Audit and scrub the whole thing to follow modern
> Overall the design elements are broadly:
> - The ULP entry points MUST NOT run in parallel with each other. The ULP
> is solely responsible for preventing this.
> - If the ULP returns !0 from its event callback it MUST guarantee that no
> other ULP threads are touching the cm_id or calling into any RDMA CM
> entry point.
> - ULP entry points can sometimes run concurrently with handler callbacks,
> although it is tricky because there are many entry points that exist
> in the flow before the handler is registered.
> - Some ULP entry points are called from the ULP event handler callback,
> under the handler_mutex. (however ucma never does this)
> - state uses a weird double locking scheme, in most cases one should hold
> the handler_mutex. (It is somewhat unclear what exactly the spinlock is
> - Reading the state without holding the spinlock should use READ_ONCE,
> even if the handler_mutex is held.
> - There are certain states which are 'stable' under the handler_mutex:
> exit from such a state also requires holding the handler_mutex. This
> explains why testing the state under only the handler_mutex makes sense.
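The state locking rules listed above can be sketched in userspace roughly as
follows. This is only an illustration, not the code in drivers/infiniband:
every name here (sketch_id, sketch_comp_exch, READ_ONCE_INT, the SKETCH_*
states) is hypothetical, pthread mutexes stand in for both the kernel mutex
and the spinlock, and a volatile load stands in for READ_ONCE.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for READ_ONCE(): force one real load. */
#define READ_ONCE_INT(x) (*(const volatile int *)&(x))

/* Hypothetical states; SKETCH_CONNECT plays the role of a 'stable' state. */
enum { SKETCH_IDLE, SKETCH_CONNECT, SKETCH_DESTROYING };

struct sketch_id {
	pthread_mutex_t handler_mutex; /* serializes handlers and stable-state exits */
	pthread_mutex_t state_lock;    /* stand-in for the spinlock guarding state */
	int state;
};

static void sketch_init(struct sketch_id *id)
{
	pthread_mutex_init(&id->handler_mutex, NULL);
	pthread_mutex_init(&id->state_lock, NULL);
	id->state = SKETCH_IDLE;
}

/* Every write to state happens under state_lock, as a compare-and-exchange. */
static bool sketch_comp_exch(struct sketch_id *id, int old_state, int new_state)
{
	bool ok;

	pthread_mutex_lock(&id->state_lock);
	ok = (id->state == old_state);
	if (ok)
		id->state = new_state;
	pthread_mutex_unlock(&id->state_lock);
	return ok;
}

/*
 * Handler path: holds only handler_mutex. The state is read without
 * state_lock, so the load goes through READ_ONCE_INT. If the state is the
 * stable one, it cannot change until handler_mutex is dropped, because
 * leaving it (below) also requires handler_mutex.
 */
static bool sketch_handle_event(struct sketch_id *id)
{
	bool handled;

	pthread_mutex_lock(&id->handler_mutex);
	handled = (READ_ONCE_INT(id->state) == SKETCH_CONNECT);
	pthread_mutex_unlock(&id->handler_mutex);
	return handled;
}

/* Leaving the stable state takes handler_mutex, then state_lock. */
static bool sketch_leave_connect(struct sketch_id *id)
{
	bool ok;

	pthread_mutex_lock(&id->handler_mutex);
	ok = sketch_comp_exch(id, SKETCH_CONNECT, SKETCH_DESTROYING);
	pthread_mutex_unlock(&id->handler_mutex);
	return ok;
}
```

In this sketch, entering SKETCH_CONNECT needs only the state lock, while
leaving it needs both locks, which is what makes the handler's unlocked test
of the state meaningful.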
> Jason Gunthorpe (8):
> RDMA/cma: Fix locking for the RDMA_CM_CONNECT state
> RDMA/cma: Make the locking for automatic state transition more clear
> RDMA/cma: Fix locking for the RDMA_CM_LISTEN state
> RDMA/cma: Remove cma_comp()
> RDMA/cma: Combine cma_ndev_work with cma_work
> RDMA/cma: Remove dead code for kernel rdmacm multicast
> RDMA/cma: Consolidate the destruction of a cma_multicast in one place
> RDMA/cma: Fix use after free race in roce multicast join
Applied to for-next