Re: [Intel-gfx] [PATCH v7 0/9] dyndbg: drm.debug adaptation

From: Ville Syrjälä
Date: Mon Oct 31 2022 - 09:08:33 EST


On Sun, Oct 30, 2022 at 08:42:52AM -0600, jim.cromie@xxxxxxxxx wrote:
> On Thu, Oct 27, 2022 at 2:10 PM Ville Syrjälä
> <ville.syrjala@xxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Oct 27, 2022 at 01:55:39PM -0600, jim.cromie@xxxxxxxxx wrote:
> > > On Thu, Oct 27, 2022 at 9:59 AM Ville Syrjälä
> > > <ville.syrjala@xxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Thu, Oct 27, 2022 at 09:37:52AM -0600, jim.cromie@xxxxxxxxx wrote:
> > > > > On Thu, Oct 27, 2022 at 9:08 AM Jason Baron <jbaron@xxxxxxxxxx> wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > On 10/21/22 05:18, Jani Nikula wrote:
> > > > > > > On Thu, 20 Oct 2022, Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> wrote:
> > > > > > >> On Sat, Sep 24, 2022 at 03:02:34PM +0200, Greg KH wrote:
> > > > > > >>> On Sun, Sep 11, 2022 at 11:28:43PM -0600, Jim Cromie wrote:
> > > > > > >>>> hi Greg, Dan, Jason, DRM-folk,
> > > > > > >>>>
> > > > > > >>>> heres follow-up to V6:
> > > > > > >>>> rebased on driver-core/driver-core-next for -v6 applied bits (thanks)
> > > > > > >>>> rework drm_debug_enabled{_raw,_instrumented,} per Dan.
> > > > > > >>>>
> > > > > > >>>> It excludes:
> > > > > > >>>> nouveau parts (immature)
> > > > > > >>>> tracefs parts (I missed --to=Steve on v6)
> > > > > > >>>> split _ddebug_site and de-duplicate experiment (way unready)
> > > > > > >>>>
> > > > > > >>>> IOW, its the remaining commits of V6 on which Dan gave his Reviewed-by.
> > > > > > >>>>
> > > > > > >>>> If these are good to apply, I'll rebase and repost the rest separately.
> > > > > > >>>
> > > > > > >>> All now queued up, thanks.
> > > > > > >>
> > > > > > >> This stuff broke i915 debugs. When I first load i915 no debug prints are
> > > > > > >> produced. If I then go fiddle around in /sys/module/drm/parameters/debug
> > > > > > >> the debug prints start to suddenly work.
> > > > > > >
> > > > > > > Wait what? I always assumed the default behaviour would stay the same,
> > > > > > > which is usually how we roll. It's a regression in my books. We've got a
> > > > > > > CI farm that's not very helpful in terms of dmesg logging right now
> > > > > > > because of this.
> > > > > > >
> > > > > > > BR,
> > > > > > > Jani.
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > That doesn't sound good - so you are saying that prior to this change some
> > > > > > of the drm debugs were default enabled. But now you have to manually enable
> > > > > > them?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > -Jason
> > > > >
> > > > >
> > > > > Im just seeing this now.
> > > > > Any new details ?
> > > >
> > > > No. We just disabled it as BROKEN for now. I was just today thinking
> > > > about sending that patch out if no solutin is forthcoming soon since
> > > > we need this working before 6.1 is released.
> > > >
> > > > Pretty sure you should see the problem immediately with any driver
> > > > (at least if it's built as a module, didn't try builtin). Or at least
> > > > can't think what would make i915 any more special.
> > > >
> > >
> > > So, I should note -
> > > 99% of my time & energy on this dyndbg + drm patchset
> > > has been done using virtme,
> > > so my world-view (and dev-hack-test env) has been smaller, simpler
> > > maybe its been fatally simplistic.
> > >
> > > ive just rebuilt v6.0 (before the trouble)
> > > and run it thru my virtual home box,
> > > I didnt see any unfamiliar drm-debug output
> > > that I might have inadvertently altered somehow
> > >
> > > I have some real HW I can put a reference kernel on,0
> > > to look for the missing output, but its all gonna take some time,
> > > esp to tighten up my dev-test-env
> > >
> > > in the meantime, there is:
> > >
> > > config DRM_USE_DYNAMIC_DEBUG
> > > bool "use dynamic debug to implement drm.debug"
> > > default y
> > > depends on DRM
> > > depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE
> > > depends on JUMP_LABEL
> > > help
> > > Use dynamic-debug to avoid drm_debug_enabled() runtime overheads.
> > > Due to callsite counts in DRM drivers (~4k in amdgpu) and 56
> > > bytes per callsite, the .data costs can be substantial, and
> > > are therefore configurable.
> > >
> > > Does changing the default fix things for i915 dmesg ?
> >
> > I think we want to mark it BROKEN in addition to make sure no one
>
> Ok, I get the distinction now.
> youre spelling that
> depends on BROKEN
>
> I have a notional explanation, and a conflating commit:
>
> can you eliminate
> git log -p ccc2b496324c13e917ef05f563626f4e7826bef1
>
> as the cause ?

Reverting that doesn't help.

>
>
>
> commit ccc2b496324c13e917ef05f563626f4e7826bef1
> Author: Jim Cromie <jim.cromie@xxxxxxxxx>
> Date: Sun Sep 11 23:28:51 2022 -0600
>
> drm_print: prefer bare printk KERN_DEBUG on generic fn
>
> drm_print.c calls pr_debug() just once, from __drm_printfn_debug(),
> which is a generic/service fn. The callsite is compile-time enabled
> by DEBUG in both DYNAMIC_DEBUG=y/n builds.
>
> For dyndbg builds, reverting this callsite back to bare printk is
> correcting a few anti-features:
>
> 1- callsite is generic, serves multiple drm users.
> it is soft-wired on currently by #define DEBUG
> could accidentally: #> echo -p > /proc/dynamic_debug/control
>
> 2- optional "decorations" by dyndbg are unhelpful/misleading here,
> they describe only the generic site, not end users
>
> IOW, 1,2 are unhelpful at best, and possibly confusing.
>
>
> This shouldnt have turned off any debug of any kind
> (drm.debug nor plain pr_debug)
>
> but that former callsite no longer does the modname:func:line prefixing
> that could have been in effect and relied upon (tested for) by your CI
>
>
> I do need to clarify, I dont know exactly what debug/logging output
> is missing such that CI is failing

CI isn't failing. But any logs it produces are 100% useless,
as are any user reported logs.

The debugs that are missing are anything not coming directly
from drm.ko.

The stuff that I see being printed by i915.ko are drm_info()
and the drm_printer stuff from i915_welcome_messages(). That
also implies that drm_debug_enabled(DRM_UT_DRIVER) does at
least still work correctly.

I suspect that the problem is just that the debug calls
aren't getting patched in when a module loads. And fiddling
with the modparam after the fact does trigger that somehow.

--
Ville Syrjälä
Intel