Re: [PATCH v2 1/2] i2c: tegra: Better handle case where CPU0 is busy for a long time
From: Dmitry Osipenko
Date: Mon May 04 2020 - 16:55:17 EST
04.05.2020 18:42, Thierry Reding ÐÐÑÐÑ:
> On Sat, May 02, 2020 at 05:40:35PM +0300, Dmitry Osipenko wrote:
>> 27.04.2020 18:31, Wolfram Sang ÐÐÑÐÑ:
>>>
>>>> Yes, that bug should be fixed anyway. But that doesn't justify breaking
>>>> suspend/resume completely, which *is* a regression.
>>>>
>>>> Look, I'm not saying that we should drop this patch altogether. All I'm
>>>> saying is that we should postpone it so that we can: a) get suspend and
>>>> resume working again (and by doing so make sure no other suspend/resume
>>>> regressions silently creep in, because that always seems to happen when
>>>> you're not looking) and b) fix any preexisting issues without possibly
>>>> scrambling the result with this perhaps unrelated fix.
>>>>
>>>> So, again, I think the safest road forward is to back this one out for
>>>> now, fix whatever this other bug is and once suspend/resume is working
>>>> properly again we can revisit this patch based on a known-good baseline.
>>>
>>> I am with you here. I want to add that the proper fix should be
>>> developed without thinking too much about stable in the first place.
>>> *When* we have a proper working fix, then we can think about making it
>>> "more" suitable for backporting. Yet, it may also be a result that older
>>> kernels need a different solution. Or have no solution at all, in case
>>> they can't do atomic_transfers and this is needed.
>>>
>>> D'accord?
>>>
>>
>> I saw that you submitted the revert of the patches for 5.7, hopefully it
>> won't result in putting the PCIe driver problem into the back burner.
>> I'll try not to forget about these patches to resubmit them later on,
>> once the problem will be resolved :)
>
> I can put these two patches into a local development branch to keep
> track of them. From what I said earlier, it looks like it would be fine
> to apply these if we also make that runtime PM change (i.e. drop force
> runtime PM and instead manually invoke runtime PM callbacks, which seems
> to be in line with what the PM maintainers suggest, as pointed out
> elsewhere in this thread).
>
> How about if I put all of that into a branch and push it to linux-next
> so that we can get some broader testing? I've already run it through our
> internal test system, which, while not perfect, is the broadest system I
> am aware of, and all tests came back positive.
Will be great.
> I'm not exactly sure I see a real issue with the PCIe driver after those
> patches are applied. The regulator errors are gone (presumably because
> the regulators now do get turned off properly) and I don't observe any
> other issues.
That's probably because this I2C patch removed the "completion done
after timeout" message. You may try to re-add the message, it should pop
up on the PCIe driver's suspension. The IRQF_NO_SUSPEND flag should fix it.
My assumption was that it should be always fine handle interrupt after
timeout, and thus, the message isn't really needed. But this wasn't a
correct assumption as we see now, so it should be better to keep the
message for the debugging purposes, maybe turn it into dev_info_once().