Re: [PATCH v2 0/4] Enlarge tracepoints in the display component

From: Daniel Vetter
Date: Thu Sep 17 2020 - 07:40:40 EST


On Wed, Sep 16, 2020 at 11:27:27AM -0400, Kazlauskas, Nicholas wrote:
> On 2020-09-16 5:12 a.m., Daniel Vetter wrote:
> > On Fri, Sep 11, 2020 at 10:59:23AM -0400, Rodrigo Siqueira wrote:
> > > Debug issues related to display can be a challenge due to the complexity
> > > around this topic and different source of information might help in this
> > > process. We already have support for tracepoints inside the display
> > > component, i.e., we have the basic functionalities available and we just
> > > need to expand it in order to make it more valuable for debugging. For
> > > this reason, this patchset reworks part of the current tracepoint
> > > options and add different sets of tracing inside amdgpu_dm, display
> > > core, and DCN10. The first patch of this series just rework part of the
> > > current tracepoints and the last set of patches introduces new
> > > tracepoints.
> > >
> > > This first patchset version is functional. Please, let me know what I
> > > can improve in the current version but also let me know what kind of
> > > tracepoint I can add for the next version.
> > >
> > > Finally, I want to highlight that this work is based on a set of patches
> > > originally made by Nicholas Kazlauskas.
> > >
> > > Change in V2:
> > > - I added another patch for capturing the clock state for different display
> > > architecture.
> >
> > Hm I'm not super sure tracepoints for state dumping are the right thing
> > here. We kinda have the atomic state dumping code with all the various
> > callbacks, and you can extend that pretty easily. Gives you full state
> > dump in debugfs, plus a few function to dump into dmesg.
> >
> > Maybe what we need is a function to dump this also into printk tracepoint
> > (otoh with Sean Paul's tracepoint work we'd get that through the dmesg
> > stuff already), and then you could do it there?
> >
> > Upside is that for customers they'd get a much more consistent way to
> > debug display issues across different drivers.
> >
> > For low-level hw debug what we do is give the hw guys an mmio trace, and
> > they replay it on the fancy boxes :-) So for that I think this here is
> > again too high level, but maybe what you have is a bit different.
> > -Daniel
>
> We have raw register traces, but what I find most useful is to be able to
> see are the incoming DRM IOCTLs, objects and properties per commit.
>
> Many of the bugs we see in display code is in the conversion from DRM -> DM
> -> DC state. The current HW state is kind of useless in most cases, but the
> sequence helps track down intermittent problems and understand state
> transitions.
>
> Tracepoints provide everything I really need to be able to track down these
> problems without falling back to a full debugger. The existing DRM prints
> (even at high logging levels) aren't enough to understand what's going on in
> most cases in our driver so funneling those into tracepoints to improve perf
> doesn't really help that much.
>
> I think this kind of idea was rejected for DRM core last year with Sean's
> patch series but if we can't get them into core then I'd like to get them
> into our driver at least. These are a cleaned up version of Sean's work + my
> work that I end up applying locally whenever I debug something.

Nah, Sean's series wasn't rejected. It's simply stuck waiting for review.
So if your goal is to get better dumping going on, I think combining this
with Sean's work (and getting that reviewed), plus then tapping into the
atomic state dumping code. Then you know what was requested, plus what your
atomic_check code computed should be the hw state, and you can compare
that with the register dumps you already grab.

Feels at least like a more complete and flexible solution than ad-hoc
tracepoints for debuggin in each driver. The idea behind Sean's work is
also that we'd have a blackbox recorder for any drm issues which distros
in the field could use. So driver doing their own debug output doesn't
sound super great.

I think Siqueira already chatted a bit with Sean.
-Daniel

>

> Regards,
> Nicholas Kazlauskas
>
> >
> > >
> > > Rodrigo Siqueira (4):
> > > drm/amd/display: Rework registers tracepoint
> > > drm/amd/display: Add tracepoint for amdgpu_dm
> > > drm/amd/display: Add pipe_state tracepoint
> > > drm/amd/display: Add tracepoint for capturing clocks state
> > >
> > > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 17 +
> > > .../amd/display/amdgpu_dm/amdgpu_dm_trace.h | 712 +++++++++++++++++-
> > > .../dc/clk_mgr/dce112/dce112_clk_mgr.c | 5 +
> > > .../display/dc/clk_mgr/dcn10/rv1_clk_mgr.c | 4 +
> > > .../display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c | 4 +
> > > .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 4 +
> > > .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 4 +
> > > drivers/gpu/drm/amd/display/dc/core/dc.c | 11 +
> > > .../gpu/drm/amd/display/dc/dce/dce_clk_mgr.c | 5 +
> > > .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 17 +-
> > > 10 files changed, 747 insertions(+), 36 deletions(-)
> > >
> > > --
> > > 2.28.0
> > >
> >
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch