Re: [PATCH] drm/ttm: fix error handling in ttm_bo_handle_move_mem()

From: Dan Carpenter
Date: Wed Jun 16 2021 - 04:38:31 EST


On Wed, Jun 16, 2021 at 08:46:33AM +0200, Christian König wrote:
> Sending the first message didn't work, so let's try again.
>
> Am 16.06.21 um 08:30 schrieb Dan Carpenter:
> > There are three bugs here:
> > 1) We need to call unpopulate() if ttm_tt_populate() succeeds.
> > 2) The "new_man = ttm_manager_type(bdev, bo->mem.mem_type);" assignment
> > was wrong and it was really assigning "new_man = old_man;". There
> > is no need for this assignment anyway as we already have the value
> > for "new_man".
> > 3) The (!new_man->use_tt) condition is reversed.
> >
> > Fixes: ba4e7d973dd0 ("drm: Add the TTM GPU memory manager subsystem.")
> > Signed-off-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> > ---
> > This is from reading the code and I can't swear that I have understood
> > it correctly. My nouveau driver is currently unusable and this patch
> > has not helped. But hopefully if I fix enough bugs eventually it will
> > start to work.
>
> Well NAK, the code previously looked quite good and you are breaking it now.
>
> What's the problem with nouveau?
>

The new Firefox seems to exercise nouveau more than the old one, so
when I start 10 Firefox windows it just hangs the graphics.

I've added debug code and it seems like the problem is that
nv50_mem_new() is failing.


> > drivers/gpu/drm/ttm/ttm_bo.c | 14 ++++++++------
> > 1 file changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> > index ebcffe794adb..72dde093f754 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > @@ -180,12 +180,12 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
> >  		 */
> >  		ret = ttm_tt_create(bo, old_man->use_tt);
> >  		if (ret)
> > -			goto out_err;
> > +			return ret;
> >  		if (mem->mem_type != TTM_PL_SYSTEM) {
> >  			ret = ttm_tt_populate(bo->bdev, bo->ttm, ctx);
> >  			if (ret)
> > -				goto out_err;
> > +				goto err_destroy;
> >  		}
> >  	}
> > @@ -193,15 +193,17 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
> >  	if (ret) {
> >  		if (ret == -EMULTIHOP)
> >  			return ret;
> > -		goto out_err;
> > +		goto err_unpopulate;
> >  	}
> >  	ctx->bytes_moved += bo->base.size;
> >  	return 0;
> > -out_err:
> > -	new_man = ttm_manager_type(bdev, bo->mem.mem_type);
>
> This here switches new and old manager. I.e. the new_man is now pointing to
> the existing resource manager.

Why not just use "old_man" directly instead of what is basically
equivalent to "new_man = old_man"? Can the old_man change part way
through the function?
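
To make the question concrete, here is a minimal sketch of the
label-per-step unwind style that the err_unpopulate/err_destroy labels
in the patch follow. Every fake_* name below is a made-up stand-in and
not a real TTM function, and the sketch says nothing about whether this
kind of unwinding is actually correct for ttm_bo_handle_move_mem().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer object; NOT the real TTM struct. */
struct fake_bo {
	bool created;
	bool populated;
};

/* Hypothetical helpers; each can be told to fail for illustration. */
static int fake_create(struct fake_bo *bo, bool fail)
{
	if (fail)
		return -ENOMEM;
	bo->created = true;
	return 0;
}

static int fake_populate(struct fake_bo *bo, bool fail)
{
	if (fail)
		return -ENOMEM;
	bo->populated = true;
	return 0;
}

static int fake_move(bool fail)
{
	return fail ? -EIO : 0;
}

static void fake_unpopulate(struct fake_bo *bo)
{
	bo->populated = false;
}

static void fake_destroy(struct fake_bo *bo)
{
	bo->created = false;
}

/*
 * Each failure jumps to the label that undoes only the steps which
 * already succeeded; the labels fall through in reverse order.
 */
static int fake_handle_move(struct fake_bo *bo, bool fail_create,
			    bool fail_populate, bool fail_move)
{
	int ret;

	ret = fake_create(bo, fail_create);
	if (ret)
		return ret;		/* nothing to undo yet */

	ret = fake_populate(bo, fail_populate);
	if (ret)
		goto err_destroy;	/* undo create only */

	ret = fake_move(fail_move);
	if (ret)
		goto err_unpopulate;	/* undo populate, then create */

	return 0;

err_unpopulate:
	fake_unpopulate(bo);
err_destroy:
	fake_destroy(bo);
	return ret;
}
```

With this layout a failing move leaves the object neither populated
nor created, while a failing populate only has to undo the create.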

regards,
dan carpenter