Re: [PATCH v2 1/2] i2c: tegra: Better handle case where CPU0 is busy for a long time
From: Jon Hunter
Date: Tue Apr 28 2020 - 04:02:09 EST
On 27/04/2020 16:18, Dmitry Osipenko wrote:
> 27.04.2020 18:12, Thierry Reding wrote:
>> On Mon, Apr 27, 2020 at 05:21:30PM +0300, Dmitry Osipenko wrote:
>>> 27.04.2020 14:00, Thierry Reding wrote:
>>>> On Mon, Apr 27, 2020 at 12:52:10PM +0300, Dmitry Osipenko wrote:
>>>>> 27.04.2020 10:48, Thierry Reding wrote:
>>>>> ...
>>>>>>> Maybe, but all these other problems appear to have existed for some time
>>>>>>> now. We need to fix all, but for the moment we need to figure out what's
>>>>>>> best for v5.7.
>>>>>>
>>>>>> To me it doesn't sound like we have a good handle on what exactly is
>>>>>> going on here and we're mostly just poking around.
>>>>>>
>>>>>> And even if things weren't working quite properly before, it sounds to
>>>>>> me like this patch actually made things worse.
>>>>>
>>>>> There is plenty of time to work on the proper fix now. To me it sounds
>>>>> like you're giving up on fixing the root of the problem, sorry.
>>>>
>>>> We're at -rc3 now and I haven't seen any promising progress in the last
>>>> week. All the while suspend/resume is now broken on at least one board
>>>> and that may end up hiding any other issues that could creep in in the
>>>> meantime.
>>>>
>>>> Furthermore we seem to have a preexisting issue that may very well
>>>> interfere with this patch, so I think the cautious thing is to revert
>>>> for now and then fix the original issue first. We can always come back
>>>> to this once everything is back to normal.
>>>>
>>>> Also, people are now looking at backporting this to v5.6. Unless we
>>>> revert this from v5.7 it may get picked up for backports to other
>>>> kernels and then I have to notify stable kernel maintainers that they
>>>> shouldn't and they have to back things out again. That's going to cause
>>>> a lot of wasted time for a lot of people.
>>>>
>>>> So, sorry, I disagree. I don't think we have "plenty of time".
>>>
>>> There is about a month now before the 5.7 release. It's a bit too early
>>> to start the panic, IMO :)
>>
>> There's no panic. A patch got merged and it broke something, so we
>> revert it and try again. It's very much standard procedure.
>>
>>> Jon already proposed a reasonable simple solution: to keep PCIe
>>> regulators always-ON. In the longer run we may want to have I2C atomic
>>> transfers supported for a late suspend phase.
>>
>> That's not really a solution, though, is it? It's just papering over
>> an issue that this patch introduced or uncovered. I'm much more in
>> favour of fixing problems at the root rather than keep papering over
>> until we lose track of what the actual problems are.
>
> It's not "papering over an issue". The bug can't be fixed properly
> without introducing I2C atomic transfers support for a late suspend
> phase, I don't see any other solutions for now. Stable kernels do not
> support atomic transfers at all, that proper solution won't be backportable.

There are a few issues here, but the issue Thierry and I are referring
to is the regression introduced by this change. Yes, this exposes other
problems, but we first need to understand why this breaks resume in
general, regardless of what the PCIe driver is doing. I will look at
this a bit more later this week.

Cheers
Jon
--
nvpublic