Re: [syzbot] KASAN: use-after-free Read in addr_handler (4)

From: Jason Gunthorpe
Date: Thu Sep 16 2021 - 12:02:30 EST


On Thu, Sep 16, 2021 at 04:45:27PM +0200, Dmitry Vyukov wrote:

> It looks like a very hard to trigger race (few crashes, no reproducer,
> but KASAN reports look sensible). That's probably the reason syzkaller
> can't create a reproducer.
> From the log it looks like it was triggered by one of these programs
> below. But I tried to reproduce manually and had no success.
> We are currently doing some improvements to race triggering code in
> syzkaller, and may try to use this as a litmus test to see if
> syzkaller will do any better:
> https://github.com/google/syzkaller/issues/612#issuecomment-920961538

I would suggest looking at this:

https://patchwork.kernel.org/project/linux-rdma/patch/0-v1-9fbb33f5e201+2a-cma_listen_jgg@xxxxxxxxxx/

Which I think should be completely deterministic (just do the RDMA_CM
ops in the right order), but syzbot didn't find a reproducer.

The "healer" fork did however:

https://lore.kernel.org/all/CACkBjsY-CNzO74XGo0uJrcaZTubC+Yw9Sg1bNNi+evUOGaZTCg@xxxxxxxxxxxxxx/#r

> Answering your question re what was running concurrently with what.
> Each of the syscalls in these programs can run up to 2 times and
> ultimately any of these calls can race with any. Potentially syzkaller
> can predict values kernel will return (e.g. id's) before kernel
> actually returned them. I guess this does not restrict search area for
> the bug a lot...

Well, it does help if it is only those system calls.

And I think I can discount the workqueue as a problem, as I'd expect a
KASAN hit on the 'req' allocation if the workqueue was malfunctioning.
Thus I must conclude we are not calling work cancellation for some
reason.

Jason