Re: [PATCH 10/10] arm64: errata: Add workaround for TSB flush failures

From: Anshuman Khandual
Date: Mon Aug 02 2021 - 05:11:28 EST




On 7/29/21 4:11 PM, Suzuki K Poulose wrote:
> On 29/07/2021 10:55, Marc Zyngier wrote:
>> On Wed, 28 Jul 2021 14:52:17 +0100,
>> Suzuki K Poulose <suzuki.poulose@xxxxxxx> wrote:
>>>
>>> Arm Neoverse-N2 (#2067961) and Cortex-A710 (#2054223) suffer
>>> from an erratum where a TSB (trace synchronization barrier)
>>> fails to flush the trace data completely when executed from
>>> a trace prohibited region. In Linux we always execute it
>>> after we have moved the PE to a trace prohibited region. So
>>> we can apply the workaround every time a TSB is executed.
>>>
>>> The workaround is to issue two TSB instructions consecutively.
>>>
>>> NOTE: This erratum is defined as a LOCAL_CPU_ERRATUM, implying
>>> that a late CPU could be blocked from booting if it is the
>>> first CPU that requires the workaround. This is because we
>>> do not allow setting a cpu_hwcap after the SMP boot. The
>>> other alternative is to use "this_cpu_has_cap()" instead
>>> of the faster system-wide check, which may be a bit of an
>>> overhead, given we may have to do this in the nVHE KVM host
>>> before a guest entry.
>>>
>>> Cc: Will Deacon <will@xxxxxxxxxx>
>>> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
>>> Cc: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx>
>>> Cc: Mike Leach <mike.leach@xxxxxxxxxx>
>>> Cc: Mark Rutland <mark.rutland@xxxxxxx>
>>> Cc: Anshuman Khandual <anshuman.khandual@xxxxxxx>
>>> Cc: Marc Zyngier <maz@xxxxxxxxxx>
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
>>> ---
>>>   Documentation/arm64/silicon-errata.rst |  4 ++++
>>>   arch/arm64/Kconfig                     | 31 ++++++++++++++++++++++++++
>>>   arch/arm64/include/asm/barrier.h       | 17 +++++++++++++-
>>>   arch/arm64/kernel/cpu_errata.c         | 19 ++++++++++++++++
>>>   arch/arm64/tools/cpucaps               |  1 +
>>>   5 files changed, 71 insertions(+), 1 deletion(-)
>>
>> [...]
>>
>>> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
>>> index 451e11e5fd23..3bc1ed436e04 100644
>>> --- a/arch/arm64/include/asm/barrier.h
>>> +++ b/arch/arm64/include/asm/barrier.h
>>> @@ -23,7 +23,7 @@
>>>   #define dsb(opt)    asm volatile("dsb " #opt : : : "memory")
>>>     #define psb_csync()    asm volatile("hint #17" : : : "memory")
>>> -#define tsb_csync()    asm volatile("hint #18" : : : "memory")
>>> +#define __tsb_csync()    asm volatile("hint #18" : : : "memory")
>>>   #define csdb()        asm volatile("hint #20" : : : "memory")
>>>     #ifdef CONFIG_ARM64_PSEUDO_NMI
>>> @@ -46,6 +46,21 @@
>>>   #define dma_rmb()    dmb(oshld)
>>>   #define dma_wmb()    dmb(oshst)
>>>   +
>>> +#define tsb_csync()                                \
>>> +    do {                                    \
>>> +        /*                                \
>>> +         * CPUs affected by Arm Erratum 2054223 or 2067961 needs    \
>>> +         * another TSB to ensure the trace is flushed.            \
>>> +         */                                \
>>> +        if (cpus_have_const_cap(ARM64_WORKAROUND_TSB_FLUSH_FAILURE)) {    \
>>
>> Could this be made a final cap instead? Or do you expect this to be
>> usable before caps have been finalised?
>
> Good point. This can be final cap.
>
>>
>>> +            __tsb_csync();                        \
>>> +            __tsb_csync();                        \
>>> +        } else {                            \
>>> +            __tsb_csync();                        \
>>> +        }                                \
>>
>> nit: You could keep one unconditional __tsb_csync().
>
> I thought about that, I was worried if the CPU expects them back to back
> without any other instructions in between them. Thinking about it a bit
> more, it doesn't look like that is the case. I will confirm this and
> change it accordingly.
But it's a very subtle change which might be difficult to debug and blame
later on, if the two instructions do indeed need to be back to back. It
seems better to just leave this unchanged.