RE: [PATCH v4] arm64: Add workaround for Fujitsu A64FX erratum 010001

From: Zhang, Lei
Date: Sat Feb 23 2019 - 08:06:57 EST


Hi guys,

> -----Original Message-----
> From: linux-arm-kernel <linux-arm-kernel-bounces@xxxxxxxxxxxxxxxxxxx> On
> Behalf Of Zhang, Lei
> Sent: Friday, February 15, 2019 9:36 PM
> To: 'James Morse' <james.morse@xxxxxxx>; Mark Rutland
> <mark.rutland@xxxxxxx>
> Cc: 'Catalin Marinas' <catalin.marinas@xxxxxxx>; 'Will Deacon'
> <will.deacon@xxxxxxx>; 'linux-kernel@xxxxxxxxxxxxxxx'
> <linux-kernel@xxxxxxxxxxxxxxx>; 'linux-arm-kernel@xxxxxxxxxxxxxxxxxxx'
> <linux-arm-kernel@xxxxxxxxxxxxxxxxxxx>
> Subject: RE: [PATCH v4] arm64: Add workaround for Fujitsu A64FX erratum
> 010001
>
>
> I think you mean it may be a problem to modify the KPTI trampoline because
> some patches about KPTI will be merged to mainline in the near future.
> I understood that.
> I should discuss with my colleagues whether we can set NFDx=0 all of time on
> A64FX.

The result of our investigation also supports your suggestion.
We agree that your proposed method (never setting NFDx=1 on A64FX)
is the best way to resolve this erratum.

For this erratum, James's patch should be merged to mainline
instead of my previous patches (v1 to v4).
Since KPTI fully covers the effect of NFD1 on A64FX, we recommend
using KPTI in conjunction with James's patch.
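
To illustrate the idea of never setting NFDx=1, here is a rough C sketch.
It is only an illustration under stated assumptions, not James's patch:
the helper name and the "affected" flag are hypothetical, and the bit
positions (NFD0 = bit 53, NFD1 = bit 54) come from the Arm architecture's
TCR_EL1 layout rather than from the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* TCR_EL1 NFD0/NFD1 bit positions, per the Arm architecture register layout. */
#define TCR_NFD0 (UINT64_C(1) << 53)
#define TCR_NFD1 (UINT64_C(1) << 54)

/*
 * Hypothetical helper (not a kernel API): return the TCR value to program,
 * with NFD0 and NFD1 cleared when the CPU is affected by erratum 010001.
 * How "affected" is determined is outside the scope of this sketch.
 */
static inline uint64_t tcr_clear_nfd_if_affected(uint64_t tcr, bool affected)
{
	if (affected)
		tcr &= ~(TCR_NFD0 | TCR_NFD1);
	return tcr;
}
```

On an affected core, every value written to TCR_ELx would pass through such
a mask, so neither bit is ever set; unaffected cores keep their value as is.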

> And thanks for your patch.
> If we can set NFDx=0 all of time, I will review, test and report the result.

I have already tested James's patch on A64FX and found no problems.

Tested-by: Zhang Lei <zhang.lei@xxxxxxxxxxxxxx>


> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index a4168d366127..b0b7f1c4e816 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -643,6 +643,25 @@ config QCOM_FALKOR_ERRATUM_E1041
>
> If unsure, say Y.
>
> +config FUJITSU_ERRATUM_010001
> +	bool "Fujitsu-A64FX erratum E#010001: Undefined fault may occur wrongly"
> +	default y
> +	help
> +	  This option adds workaround for Fujitsu-A64FX erratum E#010001.
> +	  On some variants of the Fujitsu-A64FX cores version (1.0, 1.1), memory
> +	  accesses may cause undefined fault (Data abort, DFSC=0b111111).
> +	  This fault occurs under a specific hardware condition when a
> +	  load/store instruction performs an address translation using:
> +	  case-1 TTBR0_EL1 with TCR_EL1.NFD0 == 1.
> +	  case-2 TTBR0_EL2 with TCR_EL2.NFD0 == 1.
> +	  case-3 TTBR1_EL1 with TCR_EL1.NFD1 == 1.
> +	  case-4 TTBR1_EL2 with TCR_EL2.NFD1 == 1.
> +
> +	  The workaround is to ensure these bits are clear in TCR_ELx.
> +	  The workaround only affect the Fujitsu-A64FX.

I think it would be better to add a note here, as follows:

It is recommended to enable KPTI (UNMAP_KERNEL_AT_EL0=y).

> +
> +	  If unsure, say Y.
> +
> endmenu

Thanks a lot.

Best Regards,
Zhang Lei