Re: [PATCH v12 6/7] arm64: mte: Save/Restore TFSR_EL1 during suspend

From: Lorenzo Pieralisi
Date: Tue Feb 09 2021 - 09:36:01 EST


On Tue, Feb 09, 2021 at 11:55:33AM +0000, Catalin Marinas wrote:
> On Mon, Feb 08, 2021 at 04:56:16PM +0000, Vincenzo Frascino wrote:
> > When MTE async mode is enabled TFSR_EL1 contains the accumulative
> > asynchronous tag check faults for EL1 and EL0.
> >
> > During the suspend/resume operations the firmware might perform some
> > operations that could change the state of the register resulting in
> > a spurious tag check fault report.
> >
> > Save/restore the state of the TFSR_EL1 register during the
> > suspend/resume operations to prevent this to happen.
>
> Do we need a similar fix for TFSRE0_EL1? We get away with this if
> suspend is only entered on the idle (kernel) thread but I recall we
> could also enter suspend on behalf of a user process (I may be wrong
> though).

Yes, when we suspend the machine to RAM, we execute suspend on behalf
on a userspace process (but that's only running on 1 cpu, the others
are hotplugged out).

IIUC (and that's an if) TFSRE0_EL1 is checked on kernel entry so I don't
think there is a need to save/restore it (just reset it on suspend
exit).

TFSR_EL1, I don't see a point in saving/restoring it (it is a bit
per-CPU AFAICS) either, IMO we should "check" it on suspend (if it is
possible in that context) and reset it on resume.

I don't think though you can "check" with IRQs disabled so I suspect
that TFSR_EL1 has to be saved/restored (which means that there is a
black out period where we run kernel code without being able to detect
faults but there is no solution to that other than delaying saving the
value to just before calling into PSCI). Likewise on resume from low
power.

Thanks,
Lorenzo

> If that's the case, it would make more sense to store the TFSR* regs in
> the thread_struct alongside sctlr_tcf0. If we did that, we'd not need
> the per-cpu mte_suspend_tfsr_el1 variable.
>
> --
> Catalin