Re: Regression with mainline kernel on rpi4
From: Daniel Vetter
Date: Fri Sep 24 2021 - 18:50:32 EST
On Fri, Sep 24, 2021 at 3:30 PM Maxime Ripard <maxime@xxxxxxxxxx> wrote:
> On Wed, Sep 22, 2021 at 01:25:21PM -0700, Linus Torvalds wrote:
> > On Wed, Sep 22, 2021 at 1:19 PM Sudip Mukherjee
> > <sudipm.mukherjee@xxxxxxxxx> wrote:
> > >
> > > I added some debugs to print the addresses, and I am getting:
> > > [ 38.813809] sudip crtc 0000000000000000
> > >
> > > This is from struct drm_crtc *crtc = connector->state->crtc;
> > Yeah, that was my personal suspicion, because while the line number
> > implied "crtc->state" being NULL, the drm data structure documentation
> > and other drivers both imply that "crtc" was the more likely one.
> > I suspect a simple
> >     if (!crtc)
> >             return;
> > in vc4_hdmi_set_n_cts() is at least part of the fix for this all, but
> > I didn't check if there is possibly something else that needs to be
> > done too.
> Thanks for the decode_stacktrace.sh and the follow-up.
> Yeah, it looks like we have several things wrong here:
> * we only check that connector->state is set, and not
> connector->state->crtc indeed.
> * We also check only in startup(), so at open() and not later on when
> the sound streaming actually starts. This has been there for a while,
> so I guess it's never really been causing a practical issue before.
You also have no locking, plus looking at ->state objects outside of
atomic commit machinery makes no sense because you're not actually in
sync with the hw state. Relevant bits need to be copied over at commit
time, protected by some spinlock (and that spinlock also needs to be
held over whatever other stuff you're setting to make sure we don't
get a funny out-of-sync state anywhere).
Liberally sprinkling a few NULL checks here doesn't fix much at all,
it only papers over design bugs in the code.
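To make the "copy at commit time, under a spinlock" idea concrete, here is a rough userspace sketch, not a patch against vc4. Every name in it (model_spinlock, hdmi_audio_snapshot, snapshot_commit, audio_compute_n_cts) is made up for illustration, and the spinlock is a C11 atomic_flag standing in for a kernel spinlock_t. The N/CTS arithmetic follows the usual HDMI relation 128 * fs = f_TMDS * N / CTS and is only assumed to resemble what the driver computes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel spinlock_t. */
struct model_spinlock {
	atomic_flag held;
};

static void model_spin_lock(struct model_spinlock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		; /* busy-wait until the current holder clears the flag */
}

static void model_spin_unlock(struct model_spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical subset of what the commit path knows about the output. */
struct model_crtc_state {
	uint32_t tmds_rate_hz;
};

/* The only data the audio path is allowed to look at. */
struct hdmi_audio_snapshot {
	struct model_spinlock lock;
	bool output_enabled;
	uint32_t tmds_rate_hz;
};

static void snapshot_init(struct hdmi_audio_snapshot *s)
{
	assert(s != NULL);
	atomic_flag_clear(&s->lock.held);
	s->output_enabled = false;
	s->tmds_rate_hz = 0;
}

/* Commit side: copy the relevant bits over. NULL models a disable. */
static void snapshot_commit(struct hdmi_audio_snapshot *s,
			    const struct model_crtc_state *new_state)
{
	model_spin_lock(&s->lock);
	s->output_enabled = (new_state != NULL);
	s->tmds_rate_hz = new_state ? new_state->tmds_rate_hz : 0;
	model_spin_unlock(&s->lock);
}

/*
 * Audio side: read the snapshot and derive N/CTS while holding the same
 * lock, so a concurrent commit cannot change the values halfway through.
 * Returns false when the output is disabled or the sample rate is 0.
 */
static bool audio_compute_n_cts(struct hdmi_audio_snapshot *s,
				uint32_t samplerate_hz,
				uint32_t *n, uint32_t *cts)
{
	bool ok;

	if (samplerate_hz == 0)
		return false;

	model_spin_lock(&s->lock);
	ok = s->output_enabled;
	if (ok) {
		*n = 128 * samplerate_hz / 1000;
		*cts = (uint32_t)(((uint64_t)s->tmds_rate_hz * *n) /
				  (128ull * samplerate_hz));
	}
	model_spin_unlock(&s->lock);

	return ok;
}
```

The point of the shape is that the audio callbacks never touch ->state at all: they either see a complete snapshot taken at commit time, or they see "disabled" and bail out.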
> I'm still not entirely sure how we can end up in that situation though.
> The only case I could think of is that:
> * The firmware enables the HDMI controller, then boots Linux
> * The driver starts, registers its audio card. connector->state is
> NULL then, and if the HDMI monitor is actually an HDMI monitor (vs a
> DVI monitor), the VC4_HDMI_RAM_PACKET_ENABLE bit that we test in
> startup will be set.
> * The driver will create the connector->state (through a call to
> drm_mode_config_reset in vc4_kms_load), connector->state isn't NULL
> anymore, VC4_HDMI_RAM_PACKET_ENABLE is still set.
> * The driver then disables the HDMI controller (in
> vc4_crtc_disable_at_boot) but never clears the
> VC4_HDMI_RAM_PACKET_ENABLE bit.
> * Pulseaudio opens the audio device, startup succeeds because both
> conditions we test succeed.
> * However, since the HDMI connector was either never enabled or was
> disabled at some point, connector->state->crtc is NULL and we
> get our NULL pointer dereference.
> The Ubuntu configuration has the framebuffer emulation and the
> framebuffer console enabled, so it's likely to be enabled and
> something (X.org?) comes along and disables the connector right when
> pulseaudio calls prepare().
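The sequence hypothesised in the quoted text can be replayed in a small userspace model. This is only a sketch of that hypothesis, not driver code: model_hdmi, model_startup_checks_pass, model_prepare and model_boot_sequence are invented names, and the ram_packet_enable flag merely stands in for the VC4_HDMI_RAM_PACKET_ENABLE bit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DRM objects involved. */
struct model_crtc {
	int id;
};

struct model_connector_state {
	struct model_crtc *crtc;	/* NULL while the connector is disabled */
};

struct model_hdmi {
	struct model_connector_state *state;	/* models connector->state */
	bool ram_packet_enable;			/* models the register bit */
};

/* The two conditions the quoted text says startup() tests. */
static bool model_startup_checks_pass(const struct model_hdmi *hdmi)
{
	return hdmi->state != NULL && hdmi->ram_packet_enable;
}

/* A prepare() that also checks the crtc instead of dereferencing it. */
static int model_prepare(const struct model_hdmi *hdmi)
{
	if (hdmi->state == NULL || hdmi->state->crtc == NULL)
		return -ENODEV;
	return 0;
}

/* Replays the boot steps from the quoted list, in order. */
static void model_boot_sequence(struct model_hdmi *hdmi,
				struct model_connector_state *reset_state)
{
	assert(hdmi != NULL && reset_state != NULL);

	/* Firmware left the controller on; no connector state exists yet. */
	hdmi->state = NULL;
	hdmi->ram_packet_enable = true;

	/* The mode config reset creates a state with no crtc attached. */
	reset_state->crtc = NULL;
	hdmi->state = reset_state;

	/*
	 * The controller is disabled at boot, but the packet-enable bit is
	 * never cleared, so nothing in this model changes at this step.
	 */
}
```

After model_boot_sequence() both startup conditions hold while state->crtc is still NULL, which is the window where an unchecked dereference in prepare() would oops. The extra check in model_prepare() only covers this one symptom in the model.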
Software Engineer, Intel Corporation