Re: [PATCH v2 1/2] i2c: tegra: Better handle case where CPU0 is busy for a long time

From: Thierry Reding
Date: Wed Apr 29 2020 - 12:30:26 EST


On Wed, Apr 29, 2020 at 03:35:26PM +0300, Dmitry Osipenko wrote:
> 29.04.2020 11:55, Thierry Reding wrote:
> ...
> >>> It's not "papering over an issue". The bug can't be fixed properly
> >>> without introducing I2C atomic transfers support for a late suspend
> >>> phase, I don't see any other solutions for now. Stable kernels do not
> >>> support atomic transfers at all, so that proper solution won't be
> >>> backportable.
> >>
> >> Hm... on a hunch I tried something and, lo and behold, it worked. I can
> >> get Cardhu to properly suspend/resume on top of v5.7-rc3 with the
> >> following sequence:
> >>
> >> revert 9f42de8d4ec2 i2c: tegra: Fix suspending in active runtime PM state
> >> apply http://patchwork.ozlabs.org/project/linux-tegra/patch/20191213134417.222720-1-thierry.reding@xxxxxxxxx/
> >>
> >> I also ran that through our test farm and I don't see any other issues.
> >> At the time I was already skeptical about pm_runtime_force_suspend() and
> >> pm_runtime_force_resume() and while I'm not fully certain why exactly it
> >> doesn't work, the above on top of v5.7-rc3 seems like a good option.
> >>
> >> I'll do some digging to see if I can find out why exactly force suspend
> >> and resume don't work.
> >
> > Ah... so it looks like pm_runtime_force_resume() never actually does
> > anything in this case and then disable_depth remains at 1 and the first
> > tegra_i2c_xfer() will then fail to runtime resume the controller.
>
> That's exactly the expected behaviour of the RPM force suspend/resume.
> The only unexpected part for me is that the tegra_i2c_xfer() runtime
> resume then fails in the NOIRQ phase.
>
> Anyways, again, today it's wrong to use I2C in the NOIRQ phase because
> the I2C interrupt is disabled. It's the PCIe driver that should be fixed.

I don't think so. Everything works perfectly fine if we fix system
suspend/resume not to rely on pm_runtime_force_{suspend,resume}() and
directly call the runtime suspend/resume implementations.
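
To make the difference concrete, here is a rough user-space model of the
two approaches. It is only a sketch and not kernel code: every toy_* name
is made up for illustration, and the no-op "force resume" simply encodes
the behaviour described in the quoted text above (resume does nothing and
the disable depth stays at 1), not the real PM core logic.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model only; all names are hypothetical stand-ins. */
struct toy_dev {
	int disable_depth;	/* > 0: runtime PM is disabled */
	bool hw_on;		/* controller powered/clocked */
};

/* Stand-ins for the driver's runtime suspend/resume implementations. */
static int toy_runtime_suspend(struct toy_dev *d) { d->hw_on = false; return 0; }
static int toy_runtime_resume(struct toy_dev *d) { d->hw_on = true; return 0; }

/* Force-style suspend: disable runtime PM, then power down. */
static void toy_force_suspend(struct toy_dev *d)
{
	d->disable_depth++;
	toy_runtime_suspend(d);
}

/* Force-style resume in the failing case: assumed to do nothing. */
static void toy_force_resume_noop(struct toy_dev *d) { (void)d; }

/* "Manual" pair: call the runtime callbacks directly, leave the
 * disable counter alone. */
static int toy_manual_suspend(struct toy_dev *d)
{
	return d->hw_on ? toy_runtime_suspend(d) : 0;
}

static int toy_manual_resume(struct toy_dev *d)
{
	return toy_runtime_resume(d);
}

/* A transfer has to runtime resume the controller first; in this model
 * that is refused while runtime PM is disabled. */
static int toy_xfer(struct toy_dev *d)
{
	if (!d->hw_on) {
		if (d->disable_depth > 0)
			return -EACCES;
		toy_runtime_resume(d);
	}
	return 0;
}
```

In this model a transfer after the force-style pair fails because the
controller is still off and runtime PM is still disabled, whereas after
the manual pair the controller is already powered and the transfer goes
through.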

So can we please stop deflecting and fix the damn I2C driver? From my
perspective we have two choices:

1) do what I suggested above and revert the force suspend/resume patch
and apply the "manual" suspend/resume patch instead

2) revert this patch and go back to the drawing board

I suspect that with 2) we'd end up back where we started and have to do
1) anyway.

An alternative that I'd prefer even more would be to do 2) now for v5.7
and then we do 1) for v5.8 and give this some more soaking time.

Thierry
