Re: [PATCH] arm64/mm: Drop ESR_ELx_FSC_TYPE

From: Marc Zyngier
Date: Fri Jun 14 2024 - 06:38:00 EST


On Fri, 14 Jun 2024 03:24:53 +0100,
Anshuman Khandual <anshuman.khandual@xxxxxxx> wrote:
> On 6/13/24 16:53, Marc Zyngier wrote:
> > On Thu, 13 Jun 2024 10:45:38 +0100,
> > Anshuman Khandual <anshuman.khandual@xxxxxxx> wrote:
> >>
> >> Fault status codes at page table level 0, 1, 2 and 3 for access, permission
> >> and translation faults are architecturally organized in a way, that masking
> >> out ESR_ELx_FSC_TYPE, fetches Level 0 status code for the respective fault.
> >>
> >> Helpers like esr_fsc_is_[translation|permission|access_flag]_fault() mask
> >> out ESR_ELx_FSC_TYPE before comparing against corresponding Level 0 status
> >> code as the kernel does not yet care about the page table level, the fault
> >> really occurred previously.
> >>
> >> This scheme is starting to crumble after FEAT_LPA2 when level -1 got added.
> >> Fault status code for translation fault at level -1 is 0x2B which does not
> >> follow ESR_ELx_FSC_TYPE, requiring esr_fsc_is_translation_fault() changes.
> >>
> >> This changes above helpers to compare against individual fault status code
> >> values for each page table level and drop ESR_ELx_FSC_TYPE which is losing
> >> its value as a common mask.
> >
> > I'd rather we do not drop the existing #defines, for a very
> > self-serving reason:
> >
> > NV requires an implementation to synthesise fault syndromes, and these
> > definitions are extensively used to compose the syndrome information
> > (see the NV MMU series at [1]). They are also heavily used to emulate
> > the AT instructions (fault reporting in PAR_EL1.FST).
> >
> > Having additional helpers is fine. Dropping the base definitions
> > isn't, and I'd like to avoid reintroducing them.
>
> You would like to just leave behind all the existing level 0 syndrome macro
> definitions in place ?

They are not level 0. They are values for the type of the fault. They
are *abused* as level 0, but that's not what they are here for.

>
> #define ESR_ELx_FSC_ACCESS (0x08)
> #define ESR_ELx_FSC_FAULT (0x04)
> #define ESR_ELx_FSC_PERM (0x0C)

+ ESR_ELx_FSC_{TYPE,LEVEL}, because they are convenient macros to
extract the type/level of a fault. NV further adds ESR_ELx_FSC_ADDRSZ
which has been missing.
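
To make the type/level split concrete, here is a minimal sketch. The
mask values are the ones I believe asm/esr.h carries today (0x3C and
0x03), so treat them as an assumption, and fsc_type()/fsc_level() are
hypothetical names used only for illustration:

```c
#include <assert.h>

/* Mask values assumed to match arch/arm64/include/asm/esr.h */
#define ESR_ELx_FSC_TYPE	(0x3C)
#define ESR_ELx_FSC_LEVEL	(0x03)

/* Fault type values quoted in this thread */
#define ESR_ELx_FSC_ACCESS	(0x08)
#define ESR_ELx_FSC_FAULT	(0x04)
#define ESR_ELx_FSC_PERM	(0x0C)

/* Hypothetical helper: extract the fault type bits from a syndrome */
static inline unsigned long fsc_type(unsigned long esr)
{
	return esr & ESR_ELx_FSC_TYPE;
}

/* Hypothetical helper: extract the level bits from a syndrome */
static inline unsigned long fsc_level(unsigned long esr)
{
	return esr & ESR_ELx_FSC_LEVEL;
}
```

With these, a level 2 translation fault (0x06) splits into type 0x04
and level 2, while the level -1 code 0x2B does not yield
ESR_ELx_FSC_FAULT as its type, which is why the translation fault
helper needs to change.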

>
> Or which are rather
>
> #define ESR_ELx_FSC_ACCESS ESR_ELx_FSC_ACCESS_L0
> #define ESR_ELx_FSC_FAULT ESR_ELx_FSC_FAULT_L0
> #define ESR_ELx_FSC_PERM ESR_ELx_FSC_PERM_L0

I definitely prefer the former.

> But just wondering why cannot ESR_ELx_FSC_[ACCESS|FAULT|PERM]_L0 definitions
> be used directly in new use cases ?

Because it is semantically wrong to add or OR a level onto something
that *already* describes a level. Especially for the level -1 case.

On top of that, what I dislike the most about this patch is that it
defines discrete values for something that could be parametric at zero
cost, just like ESR_ELx_FSC_SEA_TTW(). Yes, there is some additional
complexity, but nothing that the compiler can't elide.

For example, something like this:

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 7abf09df7033..c320aeb1bb9a 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -121,6 +121,10 @@
 #define ESR_ELx_FSC_SECC	(0x18)
 #define ESR_ELx_FSC_SECC_TTW(n)	(0x1c + (n))
 
+#define ESR_ELx_FSC_FAULT_nL	(0x2C)
+#define ESR_ELx_FSC_FAULT_L(n)	(((n) < 0 ? ESR_ELx_FSC_FAULT_nL : \
+				  ESR_ELx_FSC_FAULT) + (n))
+
 /* ISS field definitions for Data Aborts */
 #define ESR_ELx_ISV_SHIFT	(24)
 #define ESR_ELx_ISV		(UL(1) << ESR_ELx_ISV_SHIFT)

Importantly, it avoids the ESR_ELx_FSC_FAULT_LN1 horror, and allows
ESR_ELx_FSC_FAULT_L(-1) to be written.
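
As a quick sanity check of the "zero cost" claim, the macro can be
evaluated entirely at compile time. This is only a standalone sketch:
the two definitions are copied from the diff, ESR_ELx_FSC_FAULT is
0x04 as quoted earlier, and translation_fault_code() is a hypothetical
wrapper that exists purely for the example:

```c
#include <assert.h>

#define ESR_ELx_FSC_FAULT	(0x04)
#define ESR_ELx_FSC_FAULT_nL	(0x2C)
#define ESR_ELx_FSC_FAULT_L(n)	(((n) < 0 ? ESR_ELx_FSC_FAULT_nL : \
				  ESR_ELx_FSC_FAULT) + (n))

/* With a constant level, the whole expression folds to a constant */
_Static_assert(ESR_ELx_FSC_FAULT_L(-1) == 0x2B, "level -1");
_Static_assert(ESR_ELx_FSC_FAULT_L(0) == 0x04, "level 0");
_Static_assert(ESR_ELx_FSC_FAULT_L(1) == 0x05, "level 1");
_Static_assert(ESR_ELx_FSC_FAULT_L(2) == 0x06, "level 2");
_Static_assert(ESR_ELx_FSC_FAULT_L(3) == 0x07, "level 3");

/* Hypothetical wrapper: the same macro with a runtime level */
static inline int translation_fault_code(int level)
{
	return ESR_ELx_FSC_FAULT_L(level);
}
```

The level -1 case lands on 0x2B, matching the value given in the
commit message, and levels 0 to 3 keep their existing encodings.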

M.

--
Without deviation from the norm, progress is not possible.