Re: [PATCH v13 6/7] arm64: mte: Report async tag faults before suspend
From: Lorenzo Pieralisi
Date: Fri Feb 12 2021 - 07:03:36 EST
On Thu, Feb 11, 2021 at 03:33:52PM +0000, Vincenzo Frascino wrote:
> When MTE async mode is enabled TFSR_EL1 contains the accumulative
> asynchronous tag check faults for EL1 and EL0.
>
> During the suspend/resume operations the firmware might perform some
> operations that could change the state of the register resulting in
> a spurious tag check fault report.
>
> Report asynchronous tag faults before suspend and clear the TFSR_EL1
> register after resume to prevent this to happen.
>
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>
> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@xxxxxxx>
> ---
> arch/arm64/include/asm/mte.h | 4 ++++
> arch/arm64/kernel/mte.c | 20 ++++++++++++++++++++
> arch/arm64/kernel/suspend.c | 3 +++
> 3 files changed, 27 insertions(+)
>
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index 43169b978cd3..33e88a470357 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -41,6 +41,7 @@ void mte_sync_tags(pte_t *ptep, pte_t pte);
> void mte_copy_page_tags(void *kto, const void *kfrom);
> void flush_mte_state(void);
> void mte_thread_switch(struct task_struct *next);
> +void mte_suspend_enter(void);
> void mte_suspend_exit(void);
> long set_mte_ctrl(struct task_struct *task, unsigned long arg);
> long get_mte_ctrl(struct task_struct *task);
> @@ -66,6 +67,9 @@ static inline void flush_mte_state(void)
> static inline void mte_thread_switch(struct task_struct *next)
> {
> }
> +static inline void mte_suspend_enter(void)
> +{
> +}
> static inline void mte_suspend_exit(void)
> {
> }
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index f5aa5bea6dfe..de905102245a 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -258,12 +258,32 @@ void mte_thread_switch(struct task_struct *next)
> mte_check_tfsr_el1();
> }
>
> +void mte_suspend_enter(void)
> +{
> + if (!system_supports_mte())
> + return;
> +
> + /*
> + * The barriers are required to guarantee that the indirect writes
> + * to TFSR_EL1 are synchronized before we report the state.
> + */
> + dsb(nsh);
> + isb();
> +
> + /* Report SYS_TFSR_EL1 before suspend entry */
> + mte_check_tfsr_el1();
> +}
> +
> void mte_suspend_exit(void)
> {
> if (!system_supports_mte())
> return;
>
> update_gcr_el1_excl(gcr_kernel_excl);
> +
> + /* Clear SYS_TFSR_EL1 after suspend exit */
> + write_sysreg_s(0, SYS_TFSR_EL1);
AFAICS it is not needed, it is done already in __cpu_setup() (that is
called by cpu_resume on return from cpu_suspend() from firmware).
However, I have a question. We are relying on context switch to set
sctlr_el1_tfc0 right ? If that's the case, till the thread resuming from
low power switches context we are running with SCTLR_EL1_TCF0 not
reflecting the actual value.
Just making sure that I understand it correctly, I need to check the
resume from suspend-to-RAM path, it is something that came up with perf
save/restore already in the past.
Lorenzo
> +
> }
>
> long set_mte_ctrl(struct task_struct *task, unsigned long arg)
> diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
> index a67b37a7a47e..25a02926ad88 100644
> --- a/arch/arm64/kernel/suspend.c
> +++ b/arch/arm64/kernel/suspend.c
> @@ -91,6 +91,9 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
> unsigned long flags;
> struct sleep_stack_data state;
>
> + /* Report any MTE async fault before going to suspend */
> + mte_suspend_enter();
> +
> /*
> * From this point debug exceptions are disabled to prevent
> * updates to mdscr register (saved and restored along with
> --
> 2.30.0
>