Re: [PATCH v4] arm64: Add workaround for Fujitsu A64FX erratum 010001

From: James Morse
Date: Mon Feb 25 2019 - 12:29:12 EST


Hi Zhang,

On 23/02/2019 13:06, Zhang, Lei wrote:
> Zhang, Lei wrote:
>> I think you mean it may be a problem to modify the KPTI trampoline because
>> some patches about KPTI will be merged to mainline in the near future.
>> I understood that.
>> I should discuss with my colleagues whether we can set NFDx=0 all of the time on
>> A64FX.
>
> The result of our investigation also supports your suggestion.
> We surely agree with you that your proposed method (never set NFDx=1 on A64FX)
> is the best to resolve this erratum.
>
> For this erratum, James's patch should be merged to mainline
> instead of my previous patches (v1 to v4).
> Since KPTI fully covers the effect of NFD1 for A64FX, KPTI is
> recommended to be used in conjunction with James’s patch.

>> And thanks for your patch.
>> If we can set NFDx=0 all of the time, I will review, test and report the result.
>
> I have already tested James's patch on A64FX and found no problems.
>
> Tested-by: zhang.lei <zhang.lei@xxxxxxxxxxxxxx>

Thanks, I'll post it properly with this tag.


>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index a4168d366127..b0b7f1c4e816 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -643,6 +643,25 @@ config QCOM_FALKOR_ERRATUM_E1041
>>
>> If unsure, say Y.
>>
>> +config FUJITSU_ERRATUM_010001
>> + bool "Fujitsu-A64FX erratum E#010001: Undefined fault may occur wrongly"
>> + default y
>> + help
>> + This option adds a workaround for Fujitsu-A64FX erratum E#010001.
>> + On some variants of the Fujitsu-A64FX cores version (1.0, 1.1), memory
>> + accesses may cause an undefined fault (Data abort, DFSC=0b111111).
>> + This fault occurs under a specific hardware condition when a
>> + load/store instruction performs an address translation using:
>> + case-1 TTBR0_EL1 with TCR_EL1.NFD0 == 1.
>> + case-2 TTBR0_EL2 with TCR_EL2.NFD0 == 1.
>> + case-3 TTBR1_EL1 with TCR_EL1.NFD1 == 1.
>> + case-4 TTBR1_EL2 with TCR_EL2.NFD1 == 1.
>> +
>> + The workaround is to ensure these bits are clear in TCR_ELx.
>> + The workaround only affects the Fujitsu-A64FX.
>
> I think it is better to add a notice here as follows:
>
> It is recommended to enable KPTI (UNMAP_KERNEL_AT_EL0=y).

That unmap option is on by default; you can't turn it off without CONFIG_EXPERT. While I
agree with the recommendation, I don't think we need to spell this out in the help text.
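
For anyone following along, the workaround described in the help text above
amounts to masking two bits out of the TCR value. Below is a minimal userspace
sketch, not the patch itself. It assumes NFD0 and NFD1 sit at bits 53 and 54 of
TCR_EL1, as the Arm architecture documents them; tcr_clear_nfd() and the
cpu_affected flag are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions of the SVE non-fault disable bits in TCR_EL1. */
#define TCR_NFD0 (UINT64_C(1) << 53)
#define TCR_NFD1 (UINT64_C(1) << 54)

/* Compile-time sanity check on the combined mask. */
static_assert((TCR_NFD0 | TCR_NFD1) == UINT64_C(0x0060000000000000),
              "unexpected NFD mask");

/*
 * Hypothetical helper: return the TCR value with NFD0/NFD1 cleared when
 * the CPU is affected by the erratum, and unchanged otherwise.
 * How "affected" is detected (e.g. a MIDR match) is left out here.
 */
static uint64_t tcr_clear_nfd(uint64_t tcr, int cpu_affected)
{
    if (cpu_affected)
        tcr &= ~(TCR_NFD0 | TCR_NFD1);
    return tcr;
}
```

On an affected part, a TCR value with both bits set comes back with both bits
clear and every other field untouched; on any other CPU the value passes
through unchanged.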


Thanks,

James