Re: [git pull] drm fixes

From: Daniel Vetter
Date: Mon Mar 02 2015 - 04:03:15 EST


On Sun, Mar 01, 2015 at 05:59:53PM -0800, Linus Torvalds wrote:
> On Sun, Mar 1, 2015 at 1:00 PM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > Back to the drawing board.
>
> Ok, many hours later, but I found it.
>
> The bisection was a disaster, having to work around other bugs in this
> area, but it ended up getting "close enough" that I figured out what
> went wrong.

Sorry for all the bisect work. Ville dug as far as the load detect being
unhappy due to atomic last week. But since I general don't real mail on
w/e I've only seen this thread now.

> The "intel_plane_duplicate_state()" is horribly horribly buggy. It
> looks at the state->fb pointer, but it may have been free'd already.
>
> This workaround "works for me", but it's really still very
> questionable, because while the "kref_get_unless_zero()" works
> correctly when the last reference has been dropped, I'm not sure that
> there is any guarantee that the whole allocation even exists any more,
> so I think the *correct* thing to do would be to clear state->fb when
> dropping the kref. But this was the smallest working patch I could
> come up with. Somebody who actually knows the code should start
> looking at the places that do drm_framebuffer_unreference(), and
> actually clear that pointer instead.
>
> Added Matt Roper and Ander Conselvan de Oliveira to the discussion,
> since they are the ones git says are involved with the original broken
> intel_plane_duplicate_state().
>
> Anyway, attached is
>
> (a) the patch with a big comment
>
> (b) the warnings I get on that machine that show where this problem
> triggers (and another warning earlier).
>
> Comments? I'm sure this probably only triggers with *old* X servers
> that don't do all the modern dri stuff.

It's not old X servers but old machines which don't have hotplug
interrupts (or still have tv-out, which doesn't have hpd support
anywhere).

I'll dig into this now just fyi the rules about how plane->state should be
handled:
- you need plane->lock
- it's invariant once assigned, updates should only be done with a
duplicate_state(), update state you want to change and the going through
the atomic commit machinery
- drm_plane_state->fb should always be a full reference

But switching to atomic amounts to a full rewrite of the drm core and
drivers (semantics for modeset updates are totally different). So there's
lots of glue and ducttape going on to keep up the illusion for both old
code and partially transitioned driver. It's all a bit wobbly atm and
don't loook too closely at some of the hacks we have.

And can you please attach a bactrace of the WARN in your patch, just to
double-check you blow up at the same spot?

Thanks, Daniel

> Linus

> From c182b15c3abee75cdc9d9564b6ab826403690f4e Mon Sep 17 00:00:00 2001
> From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxxx>
> Date: Sat, 28 Feb 2015 21:44:48 -0800
> Subject: [PATCH] Workaround for drm bug
>
> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>
> ---
> drivers/gpu/drm/i915/intel_atomic_plane.c | 19 +++++++++++++++++--
> 1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_atomic_plane.c b/drivers/gpu/drm/i915/intel_atomic_plane.c
> index 9e6f727..72714d3 100644
> --- a/drivers/gpu/drm/i915/intel_atomic_plane.c
> +++ b/drivers/gpu/drm/i915/intel_atomic_plane.c
> @@ -85,8 +85,23 @@ intel_plane_duplicate_state(struct drm_plane *plane)
> return NULL;
>
> state = &intel_state->base;
> - if (state->fb)
> - drm_framebuffer_reference(state->fb);
> +
> + /*
> + * We cannot do drm_framebuffer_reference(), because the reference
> + * may already have been dropped.
> + *
> + * So we do what drm_framebuffer_lookup() does, namely do a
> + * kref_get_unless_zero(). Even that is somewhat questionable,
> + * in that maybe the 'fb' already got free'd. So warn loudly
> + * about this.
> + *
> + * Maybe the base.fb should be cleared by whatever drops the
> + * reference?
> + */
> + if (state->fb && !kref_get_unless_zero(&state->fb->refcount)) {
> + state->fb = NULL;
> + WARN_ONCE(1, "intel_plane_duplicate_state got plane with dead frame buffer");
> + }
>
> return state;
> }
> --
> 2.3.1.167.g7f4ba4b
>


> _______________________________________________
> dri-devel mailing list
> dri-devel@xxxxxxxxxxxxxxxxxxxxx
> http://lists.freedesktop.org/mailman/listinfo/dri-devel


--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/