Re: KASAN: use-after-free Read in ucma_close (2)

From: Jason Gunthorpe
Date: Fri Sep 11 2020 - 13:03:15 EST


On Fri, Sep 11, 2020 at 11:20:17PM +0800, Hillf Danton wrote:
>
> On Fri, 11 Sep 2020 08:57:50 -0300 Jason Gunthorpe wrote:
> > On Fri, Sep 11, 2020 at 12:16:40PM +0800, Hillf Danton wrote:
> > > Detect race destroying ctx in order to avoid UAF.
> > >
> > > +++ b/drivers/infiniband/core/ucma.c
> > > @@ -625,6 +625,10 @@ static ssize_t ucma_destroy_id(struct uc
> > > return PTR_ERR(ctx);
> > >
> > > mutex_lock(&ctx->file->mut);
> > > + if (ctx->destroying == 1) {
> > > + mutex_unlock(&ctx->file->mut);
> > > + return -ENXIO;
> > > + }
> > > ctx->destroying = 1;
> > > mutex_unlock(&ctx->file->mut);
> > >
> > > @@ -1826,6 +1830,8 @@ static int ucma_close(struct inode *inod
> > >
> > > mutex_lock(&file->mut);
> > > list_for_each_entry_safe(ctx, tmp, &file->ctx_list, list) {
> > > + if (ctx->destroying == 1)
> > > + continue;
> > > ctx->destroying = 1;
> > > mutex_unlock(&file->mut);
> > >
> >
> > ucma_destroy_id() is called from write() and ucma_close is release(),
> > so there is no way these can race?
>
> Sound good but what's reported is uaf in the close path, which is
> impossible without another thread releasing the ctx a step ahead
> the closer.
> Can we call it a race if that's true?

Migrate is the cause, very tricky:

CPU0 CPU1
ucma_destroy_id()
ucma_migrate_id()
ucma_get_ctx()
xa_lock()
_ucma_find_context()
xa_erase()
xa_lock()
ctx->file = new_file
list_move()
xa_unlock()
ucma_put_ctx
ucma_close()
_destroy_id()

_destroy_id()
wait_for_completion()
// boom


ie the destrory_id() on the initial FD captures the ctx right before
migrate moves it, then the new FD closes calling destroy while the
other destroy is still running.

Sigh, I will rewrite migrate too..

Jason