Re: [PATCH 1/2] drm/msm/dp: Clean up handling of DP AUX interrupts

From: Doug Anderson
Date: Thu Jan 26 2023 - 20:17:44 EST


Hi,

On Wed, Jan 25, 2023 at 9:13 AM Kuogee Hsieh <quic_khsieh@xxxxxxxxxxx> wrote:
>
>
> On 1/19/2023 2:53 PM, Douglas Anderson wrote:
> > The DP AUX interrupt handling was a bit of a mess.
> > * There were two functions (one for "native" transfers and one for
> > "i2c" transfers) that were quite similar. It was hard to say how
> > many of the differences between the two functions were on purpose
> > and how many of them were just an accident of how they were coded.
> > * Each function sometimes used "else if" to test for error bits and
> > sometimes didn't and again it was hard to say if this was on purpose
> > or just an accident.
> > * The two functions wouldn't notice whether "unknown" bits were
> > set. For instance, there seems to be a bit "DP_INTR_PLL_UNLOCKED"
> > and if it was set there would be no indication.
> > * The two functions wouldn't notice if more than one error was set.
> >
> > Let's fix this by being more consistent / explicit about what we're
> > doing.
> >
> > By design this could cause different handling for AUX transfers,
> > though I'm not actually aware of any bug fixed as a result of
> > this patch (this patch was created because we simply noticed how odd
> > the old code was by code inspection). Specific notes here:
> > 1. In the old native transfer case if we got "done + wrong address"
> > we'd ignore the "wrong address" (because of the "else if"). Now we
> > won't.
> > 2. In the old native transfer case if we got "done + timeout" we'd
> > ignore the "timeout" (because of the "else if"). Now we won't.
> > 3. In the old native transfer case we'd see "nack_defer" and translate
> > it to the error number for "nack". This differed from the i2c
> > transfer case where "nack_defer" was given the error number for
> > "nack_defer". This 100% can't matter because the only user of this
> > error number treats "nack defer" the same as "nack", so it's clear
> > that the difference between the "native" and "i2c" was pointless
> > here.
> > 4. In the old i2c transfer case if we got "done" plus any error
> > besides "nack" or "defer" then we'd ignore the error. Now we don't.
> > 5. If there is more than one error signaled by the hardware it's
> > possible that we'll report a different one than we used to. I don't
> > know if this matters. If someone is aware of a case this matters we
> > should document it and change the code to make it explicit.
> > 6. One quirk we keep (I don't know if this is important) is that in
> > the i2c transfer case if we see "done + defer" we report that as a
> > "nack". That seemed too intentional in the old code to just drop.
> >
> > After this change we will add extra logging, including:
> > * A warning if we see more than one error bit set.
> > * A warning if we see an unexpected interrupt.
> > * A warning if we get an AUX transfer interrupt when shouldn't.
> >
> > It actually turns out that as a result of this change then at boot we
> > sometimes see an error:
> > [drm:dp_aux_isr] *ERROR* Unexpected DP AUX IRQ 0x01000000 when not busy
> > That means that, during init, we are seeing DP_INTR_PLL_UNLOCKED. For
> > now I'm going to say that leaving this error reported in the logs is
> > OK-ish and hopefully it will encourage someone to track down what's
> > going on at init time.
> >
> > One last note here is that this change renames one of the interrupt
> > bits. The bit named "i2c done" clearly was used for native transfers
> > being done too, so I renamed it to indicate this.
> >
> > Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
> > ---
> > I don't have good test coverage for this change and it does have the
> > potential to change behavior. I confirmed that eDP and DP still
> > continue to work OK on one machine. Hopefully folks can test it more.
> >
> > drivers/gpu/drm/msm/dp/dp_aux.c | 80 ++++++++++++-----------------
> > drivers/gpu/drm/msm/dp/dp_catalog.c | 2 +-
> > drivers/gpu/drm/msm/dp/dp_catalog.h | 2 +-
> > 3 files changed, 36 insertions(+), 48 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
> > index cc3efed593aa..34ad08ae6eb9 100644
> > --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> > +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> > @@ -162,47 +162,6 @@ static ssize_t dp_aux_cmd_fifo_rx(struct dp_aux_private *aux,
> > return i;
> > }
> >
> > -static void dp_aux_native_handler(struct dp_aux_private *aux, u32 isr)
> > -{
> > - if (isr & DP_INTR_AUX_I2C_DONE)
> > - aux->aux_error_num = DP_AUX_ERR_NONE;
> > - else if (isr & DP_INTR_WRONG_ADDR)
> > - aux->aux_error_num = DP_AUX_ERR_ADDR;
> > - else if (isr & DP_INTR_TIMEOUT)
> > - aux->aux_error_num = DP_AUX_ERR_TOUT;
> > - if (isr & DP_INTR_NACK_DEFER)
> > - aux->aux_error_num = DP_AUX_ERR_NACK;
> > - if (isr & DP_INTR_AUX_ERROR) {
> > - aux->aux_error_num = DP_AUX_ERR_PHY;
> > - dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> > - }
> > -}
> > -
> > -static void dp_aux_i2c_handler(struct dp_aux_private *aux, u32 isr)
> > -{
> > - if (isr & DP_INTR_AUX_I2C_DONE) {
> > - if (isr & (DP_INTR_I2C_NACK | DP_INTR_I2C_DEFER))
> > - aux->aux_error_num = DP_AUX_ERR_NACK;
> > - else
> > - aux->aux_error_num = DP_AUX_ERR_NONE;
> > - } else {
> > - if (isr & DP_INTR_WRONG_ADDR)
> > - aux->aux_error_num = DP_AUX_ERR_ADDR;
> > - else if (isr & DP_INTR_TIMEOUT)
> > - aux->aux_error_num = DP_AUX_ERR_TOUT;
> > - if (isr & DP_INTR_NACK_DEFER)
> > - aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
> > - if (isr & DP_INTR_I2C_NACK)
> > - aux->aux_error_num = DP_AUX_ERR_NACK;
> > - if (isr & DP_INTR_I2C_DEFER)
> > - aux->aux_error_num = DP_AUX_ERR_DEFER;
> > - if (isr & DP_INTR_AUX_ERROR) {
> > - aux->aux_error_num = DP_AUX_ERR_PHY;
> > - dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> > - }
> > - }
> > -}
> > -
> > static void dp_aux_update_offset_and_segment(struct dp_aux_private *aux,
> > struct drm_dp_aux_msg *input_msg)
> > {
> > @@ -427,13 +386,42 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
> > if (!isr)
> > return;
> >
> > - if (!aux->cmd_busy)
> > + if (!aux->cmd_busy) {
> > + DRM_ERROR("Unexpected DP AUX IRQ %#010x when not busy\n", isr);
> > return;
> > + }
> >
> > - if (aux->native)
> > - dp_aux_native_handler(aux, isr);
> > - else
> > - dp_aux_i2c_handler(aux, isr);
> > + /*
> > + * The logic below assumes only one error bit is set (other than "done"
> > + * which can apparently be set at the same time as some of the other
> > + * bits). Warn if more than one get set so we know we need to improve
> > + * the logic.
> > + */
> > + if (hweight32(isr & ~DP_INTR_AUX_XFER_DONE) > 1)
> > + DRM_WARN("Some DP AUX interrupts unhandled: %#010x\n", isr);
> > +
> > + if (isr & DP_INTR_AUX_ERROR) {
> > + aux->aux_error_num = DP_AUX_ERR_PHY;
> > + dp_catalog_aux_clear_hw_interrupts(aux->catalog);
> > + } else if (isr & DP_INTR_NACK_DEFER) {
> > + aux->aux_error_num = DP_AUX_ERR_NACK_DEFER;
> > + } else if (isr & DP_INTR_WRONG_ADDR) {
> > + aux->aux_error_num = DP_AUX_ERR_ADDR;
> > + } else if (isr & DP_INTR_TIMEOUT) {
> > + aux->aux_error_num = DP_AUX_ERR_TOUT;
> > + } else if (isr & DP_INTR_AUX_XFER_DONE) {
> > + aux->aux_error_num = DP_AUX_ERR_NONE;
>
>
> 1) both DP_INTR_AUX_XFER_DONE and DP_INTR_I2C_NACK are set
>
> 2) both DP_INTR_AUX_XFER_DONE and DP_INTR_I2C_DEFER are set
>
> with above two condition, below two "else if" will not be reached since
> DP_INTR_AUX_XFER_DONE is check with higher priority

Indeed, that is a bug, good catch! I had the "DONE" at the end at the
beginning but then I remember thinking it looked ugly because the two
I2C cases below had the extra "aux->native". Moved it to the right
place and I'll send a quick v2 since I don't expect more feedback
since it's already been a week.

-Doug