Re: [syzbot] KASAN: use-after-free Read in addr_handler (4)

From: Jason Gunthorpe
Date: Thu Sep 16 2021 - 13:52:23 EST


On Thu, Sep 16, 2021 at 04:45:27PM +0200, Dmitry Vyukov wrote:

> Answering your question re what was running concurrently with what.
> Each of the syscalls in these programs can run up to 2 times and
> ultimately any of these calls can race with any. Potentially syzkaller
> can predict values kernel will return (e.g. id's) before kernel
> actually returned them. I guess this does not restrict search area for
> the bug a lot...

I have a reasonable theory now..

Based on the ops you provided this FSM sequence is possible

RDMA_USER_CM_CMD_RESOLVE_IP
RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY
does rdma_resolve_ip(addr_handler)

addr_handler
RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND
[.. handler still running ..]

RDMA_USER_CM_CMD_RESOLVE_IP
RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY
does rdma_resolve_ip(addr_handler)

RDMA_DESTROY_ID
rdma_addr_cancel()

Which, if it happens fast enough, could trigger a situation where the
'&id_priv->id.route.addr.dev_addr' "handle" is in the req_list twice
beacause the addr_handler work queue hasn't yet got to the point of
deleting it from the req_list before the the 2nd one is added.

The issue is rdma_addr_cancel() has to be called rdma_resolve_ip() can
be called again.

Skipping it will cause 'req_list' to have two items in the internal
linked list with the same key and it will not cancel the newest one
with the active timer. This would cause the use after free syndrome
like this trace is showing.

I can make a patch, but have no way to know if it is any good :\

Jason