Re: [PATCH v2] i2c: omap: Fix standard mode false ACK readings

From: Andreas Kemnade
Date: Fri Sep 13 2024 - 09:33:12 EST


On Fri, 13 Sep 2024 14:28:59 +0200,
"H. Nikolaus Schaller" <hns@xxxxxxxxxxxxx> wrote:

> Hi,
>
>
> > On 13.09.2024 at 14:09, Andreas Kemnade
> > <andreas@xxxxxxxxxxxx> wrote:
> >
> > On Wed, 11 Sep 2024 11:40:04 +0200,
> > "H. Nikolaus Schaller" <hns@xxxxxxxxxxxxx> wrote:
> >
> >> Hi,
> >>
> >>> On 28.04.2023 at 20:30, Reid Tonking <reidt@xxxxxx> wrote:
> >>>
> >>> On 10:43-20230428, Tony Lindgren wrote:
> >>>> * Raghavendra, Vignesh <vigneshr@xxxxxx> [230427 13:18]:
> >>>>> On 4/27/2023 1:19 AM, Reid Tonking wrote:
> >>>>>> Using standard mode, rare false ACK responses were appearing
> >>>>>> with i2cdetect tool. This was happening due to NACK interrupt
> >>>>>> triggering ISR thread before register access interrupt was
> >>>>>> ready. Removing the NACK interrupt's ability to trigger ISR
> >>>>>> thread lets register access ready interrupt do this instead.
> >>>>>>
> >>>>
> >>>> So is it safe to leave NACK interrupt unhandled until we get the
> >>>> next interrupt, does the ARDY always trigger after hitting this?
> >>>>
> >>>> Regards,
> >>>>
> >>>> Tony
> >>>
> >>> Yep, the ARDY always gets set after a new command when register
> >>> access is ready so there's no need for NACK interrupt to control
> >>> this.
> >>
> >> I have tested one GTA04A5 board where this patch breaks boot on
> >> v4.19.283 or v6.11-rc7 (where it was inherited from some earlier
> >> -rc series).
> >>
> >> The device is either stuck with no signs of activity or reports RCU
> >> stalls after a 20 second pause.
> >>
> > cannot reproduce it here.
>
> That is good for you :)
>
Which does not mean that there is no problem...

> > I had a patch to disable 1 GHz on that
> > device in my tree. Do you have anything strange in your
> > tree?
>
> No, and the omap3 is running with 800 MHz only.
>
So you have a patch disabling the 1 GHz OPP in there? The error messages
look like what I got when 1 GHz was enabled, so better double-check.

If it is Letux, then there is, e.g., the interrupt reversal in there.
Maybe it unveils some problem which should be fixed, maybe it is
harmful; it was never well reviewed...

> I haven't tested on another board but the bug is very reproducible
> and I was able to bisect it to this patch, which makes the difference.
>
The error messages, especially those regarding RCU, do not look closely
related to this. Maybe having this patch or not triggers some other bug.
Maybe we trigger some race condition. Or I2C error checking regarding
OPP setting...

> So there may be boards which happily run with the patch and some
> don't. Maybe a race condition with hardware.
>
I am not ruling out that this patch has nasty side effects, but I think
there is more at play.

Regards,
Andreas