Re: [PATCH] arm64/mm: Re-organise setting up FEAT_S1PIE registers PIRE0_EL1 and PIR_EL1
From: Ryan Roberts
Date: Tue Apr 15 2025 - 03:53:36 EST
On 15/04/2025 07:27, Anshuman Khandual wrote:
>
>
> On 4/14/25 18:01, Ryan Roberts wrote:
>> On 14/04/2025 13:28, Ard Biesheuvel wrote:
>>> On Mon, 14 Apr 2025 at 14:04, Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
>>>>
>>>> On 14/04/2025 10:41, Ard Biesheuvel wrote:
>>>>> On Mon, 14 Apr 2025 at 09:52, Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
>>>>>>
>>>>>> On 10/04/2025 08:40, Anshuman Khandual wrote:
>>>>>>> mov_q cannot move the PIE_E[0|1] macros into a general-purpose register
>>>>>>> as expected when those macro constants contain 128-bit layout elements,
>>>>>>> as required for D128 page tables. Fix this by first loading these macro
>>>>>>> constants into a given memory location, and then setting up the
>>>>>>> PIRE0_EL1 and PIR_EL1 registers from the values stored there.
>>>>>>
>>>>>> From memory, the primary issue is that for D128, PIE_E[0|1] are defined in terms
>>>>>> of 128-bit types with shifting and masking, which the assembler can't do? It
>>>>>> would be good to spell this out.
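>>>>>>
>>>>>> E.g. something like the below is the shape of the problem as I
>>>>>> understand it (just a sketch from memory; the names, values and
>>>>>> field positions are made up, not the real D128 layout):
>>>>>>
>>>>>> 	/* hypothetical values, for illustration only */
>>>>>> 	#define PIE_PERM_RW	((__uint128_t)0x5)
>>>>>> 	#define PIE_PERM_RX	((__uint128_t)0xa)
>>>>>> 	/* a >64-bit layout like this is fine for the C compiler ... */
>>>>>> 	#define PIE_E0	((PIE_PERM_RW << 68) | (PIE_PERM_RX << 4))
>>>>>> 	/* ... but gas only evaluates (at most) 64-bit expressions, so
>>>>>> 	   "mov_q x0, PIE_E0" (and the .if inside mov_q) falls over. */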
>>>>>>
>>>>>>>
>>>>>>> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
>>>>>>> Cc: Will Deacon <will@xxxxxxxxxx>
>>>>>>> Cc: Mark Rutland <mark.rutland@xxxxxxx>
>>>>>>> Cc: Ard Biesheuvel <ardb@xxxxxxxxxx>
>>>>>>> Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
>>>>>>> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
>>>>>>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>>>>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@xxxxxxx>
>>>>>>> ---
>>>>>>> This patch applies on v6.15-rc1
>>>>>>>
>>>>>>> arch/arm64/kernel/head.S | 3 +++
>>>>>>> arch/arm64/kernel/pi/map_range.c | 6 ++++++
>>>>>>> arch/arm64/kernel/pi/pi.h | 1 +
>>>>>>> arch/arm64/mm/mmu.c | 1 +
>>>>>>> arch/arm64/mm/proc.S | 5 +++--
>>>>>>> 5 files changed, 14 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
>>>>>>> index 2ce73525de2c..4950d9cc638a 100644
>>>>>>> --- a/arch/arm64/kernel/head.S
>>>>>>> +++ b/arch/arm64/kernel/head.S
>>>>>>> @@ -126,6 +126,9 @@ SYM_CODE_START(primary_entry)
>>>>>>> * On return, the CPU will be ready for the MMU to be turned on and
>>>>>>> * the TCR will have been set.
>>>>>>> */
>>>>>>> + adr_l x0, pir_data
>>>>>>> + bl __pi_load_pir_data
>>>>>>
>>>>>> Using C code to pre-calculate the values into global variables that the assembly
>>>>>> code then loads and stuffs into the PIR registers feels hacky. I wonder if we
>>>>>> can instead pre-calculate into asm-offsets.h? e.g. add the following to
>>>>>> asm-offsets.c:
>>>>>>
>>>>>> DEFINE(PIE_E0_ASM, PIE_E0);
>>>>>> DEFINE(PIE_E1_ASM, PIE_E1);
>>>>>>
>>>>>> Which will generate the asm-offsets.h header with PIE_E[0|1]_ASM with the
>>>>>> pre-calculated values that you can then use in proc.S?
>>>>>>
>>>>>
>>>>> There is another issue, which is that mov_q tries to be smart, and
>>>>> emit fewer than 4 MOVZ/MOVK instructions if possible. So the .if
>>>>> directive evaluates the argument, which does not work with symbolic
>>>>> constants.
>>>>
>>>> I'm not quite understanding the detail here; what do you mean by "symbolic
>>>> constants"? asm-offsets.h will provide something like:
>>>>
>>>> #define PIE_E0_ASM 1234567890
>>>>
>>>> The current code is using a hash-define and that's working fine:
>>>>
>>>> mov_q x0, PIE_E0
>>>>
>>>>
>>>> Won't the C preprocessor just substitute and everything will work out?
>>>>
>>>
>>> Yeah, you're right. I was experimenting with something like
>>>
>>> .set .Lpie_e0, PIE_E0_ASM
>>> mov_q xN, .Lpie_e0
>>>
>>> where this problem does exist, but we can just use PIE_E0_ASM directly
>>> and things should work as expected.
>>
>> Ahh great, sounds like this should be pretty simple then!
>
> The following change works both with the current code and with D128 page tables.
>
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -182,5 +182,7 @@ int main(void)
> #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> DEFINE(FTRACE_OPS_DIRECT_CALL, offsetof(struct ftrace_ops, direct_call));
> #endif
> + DEFINE(PIE_E0_ASM, PIE_E0);
> + DEFINE(PIE_E1_ASM, PIE_E1);
> return 0;
> }
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index 737c99d79833..f45494425d09 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -536,9 +536,9 @@ alternative_else_nop_endif
> #define PTE_MAYBE_NG 0
> #define PTE_MAYBE_SHARED 0
I think at minimum, you can remove this PTE_MAYBE_* hack from proc.S. But as Ard
says, you may need to add it to asm-offsets.c? I'm surprised asm-offsets.c even
compiles without this hack since surely it doesn't have arm64_use_ng_mappings or
is_realm_world() available?
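
FWIW, from memory the relevant definition is a runtime check along these
lines (approximate; double-check pgtable-prot.h for the exact form):

	/* pgtable-prot.h (approximate, from memory) */
	#define PTE_MAYBE_NG	(arm64_use_ng_mappings ? PTE_NG : 0)

and IIRC asm-offsets' DEFINE() passes its value through an "i" (immediate)
asm constraint, so anything that isn't a compile-time constant should fail
to build. So I'd expect asm-offsets.c to need the same "#define ... 0"
trick (or some equivalent) before evaluating PIE_E[0|1].
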
Thanks,
Ryan
>
> - mov_q x0, PIE_E0
> + mov_q x0, PIE_E0_ASM
> msr REG_PIRE0_EL1, x0
> - mov_q x0, PIE_E1
> + mov_q x0, PIE_E1_ASM
> msr REG_PIR_EL1, x0
>
> #undef PTE_MAYBE_NG
>