Re: [linux-sunxi] Re: [PATCH 0/2] Allwinner A64 timer workaround

From: Marc Zyngier
Date: Wed Jul 04 2018 - 11:30:49 EST


On 04/07/18 16:15, Andre Przywara wrote:
> Hi,
>
> On 04/07/18 16:01, Marc Zyngier wrote:
>> On Wed, 04 Jul 2018 15:44:36 +0100,
>> Andre Przywara <andre.przywara@xxxxxxx> wrote:
>>>
>>> Hi,
>>>
>>> On 04/07/18 15:31, Thomas Gleixner wrote:
>>>> On Wed, 4 Jul 2018, Andre Przywara wrote:
>>>>> On 04/07/18 11:00, Thomas Gleixner wrote:
>>>>>> On Wed, 4 Jul 2018, Marc Zyngier wrote:
>>>>>>> On 04/07/18 09:23, Daniel Lezcano wrote:
>>>>>>>>
>>>>>>>> If the patches fix a bug which already exists, it makes sense to
>>>>>>>> propagate the fix back to the stable versions.
>>>>>>>
>>>>>>> That's your call, but I'm not supportive of that decision, especially as
>>>>>>> we have information from the person developing the workaround that this
>>>>>>> doesn't fully address the issue.
>>>>>>
>>>>>> The patches should not be applied at all. Simply because they don't fix the
>>>>>> issue completely.
>>>>>>
>>>>>> From a quick glance at various links and information about this, this very
>>>>>> much smells like the FSL_ERRATUM_A008585.
>>>>>> Has that been tried? It looks way more robust than the magic 11 bit
>>>>>> crystal ball logic.
>>>>>
>>>>> The Freescale erratum is similar, but not identical [1].
>>>>> It seems like the A64 is less variable, so we can use a cheaper
>>>>> workaround, which gets away with normally just one sysreg read. But then
>>>>> again the newer error reports may actually suggest otherwise ...
>>>>>
>>>>> And as it currently stands, the Freescale erratum has the drawback of
>>>>> relying on the CPU running much faster than the timer. The A64 can run
>>>>> at 24 MHz (for power savings, or possibly during DVFS transitions),
>>>>> which is the timer frequency. So subsequent counter reads will never
>>>>> return the same value and the workaround times out.
>>>>
>>>> If that's the case then you need to find a different functional timer for
>>>> time keeping. Having an erratically behaving timer for time keeping is not an
>>>> option at all.
>>>
>>> That's not an option on arm64. There are other usable time sources in
>>> the SoC, but the arch timer is somewhat mandatory for all practical
>>> purposes on arm64. We rely on it in so many places that it's not
>>> feasible to run without it. That's why we call it "architected" timer
>>> after all ;-)
>>> But I am quite confident that we can find a correct workaround. Maybe
>>> it's really the TVAL (the downcounter) write which is the culprit here,
>>> since the hardware actually writes "now() + TVAL" into the CVAL
>>> (upcounter) register. This internal counter access may be flawed as well.
>>
>> You got it backward: CVAL is not a counter at all. It is a
>> Comparator. And TVAL has an implicit read from the counter, as it is
>> defined as "CVAL - CNT" (i.e. the number of ticks until the timer
>> expires).
>
> Yes, that's what I meant actually, sorry for the lousy wording.
>
> What I am actually more concerned about than reading (do we actually
> read TVAL?), is writing TVAL. The original BSP errata hack hints at this
> being a problem:
> https://github.com/longsleep/linux-pine64/blob/5b10a45ae8b0/drivers/clocksource/arm_arch_timer.c#L231-L244

Right, and they only address the comparator, ignoring the counter.
Braindead. I especially enjoy the "we should try to fix this" comment.

>> So it might be worth trying to handle TVAL entirely in SW.

Given that, I think handling TVAL in SW makes sense:

- write TVAL: read CNT until stable, add the delta, write CVAL instead
- read TVAL: read CNT until stable, subtract CVAL, return the delta
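
A rough user-space sketch of those two steps, for illustration only. It is
not the kernel implementation: the counter and comparator are simulated by
plain variables, and every name here (hw_read_cnt, read_cnt_stable,
sw_write_tval, sw_read_tval, MAX_RETRIES, the 0x7ff glitch mask) is a
hypothetical stand-in rather than anything from the arch timer driver.
"Stable" is assumed to mean two consecutive reads returning the same value.

```c
#include <stdint.h>

#define MAX_RETRIES 50   /* hypothetical bound, not taken from any driver */

/* Simulated hardware state; stand-ins for the real sysreg accessors. */
static uint64_t fake_cnt;       /* counter value (CNT) */
static uint64_t fake_cval;      /* comparator value (CVAL) */
static uint64_t step_per_read;  /* ticks elapsed per read: 0 = fast CPU */
static int glitch_next;         /* corrupt the next counter sample once */

static uint64_t hw_read_cnt(void)
{
	uint64_t v = fake_cnt;

	fake_cnt += step_per_read;
	if (glitch_next) {
		glitch_next = 0;
		return v ^ 0x7ff;   /* bogus low bits, for illustration only */
	}
	return v;
}

/* Returns 0 and stores a stable sample, or -1 if no two reads agreed. */
static int read_cnt_stable(uint64_t *cnt)
{
	uint64_t prev = hw_read_cnt();

	for (int i = 0; i < MAX_RETRIES; i++) {
		uint64_t cur = hw_read_cnt();

		if (cur == prev) {
			*cnt = cur;
			return 0;
		}
		prev = cur;
	}
	return -1;
}

/* write TVAL: stable CNT plus the signed 32-bit delta goes into CVAL. */
static int sw_write_tval(int32_t delta)
{
	uint64_t cnt;

	if (read_cnt_stable(&cnt))
		return -1;
	fake_cval = cnt + (uint64_t)(int64_t)delta;
	return 0;
}

/* read TVAL: CVAL minus stable CNT, truncated to a signed 32-bit value. */
static int sw_read_tval(int32_t *tval)
{
	uint64_t cnt;

	if (read_cnt_stable(&cnt))
		return -1;
	*tval = (int32_t)(int64_t)(fake_cval - cnt);
	return 0;
}
```

With step_per_read set to 0 a single glitched sample is filtered out. With
step_per_read set to 1 (counter ticking as fast as it is read) no two reads
ever match and read_cnt_stable() gives up after MAX_RETRIES.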

The low frequency problem remains. If it can't be solved, drop the arch
timer from the DT (it is dead), and use a separate timer/counter. Simply
not fit for purpose.

M.
--
Jazz is not dead. It just smells funny...