Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary
From: Leizhen (ThunderTown)
Date: Tue Sep 22 2020 - 08:30:39 EST
On 2020/9/21 16:53, Leizhen (ThunderTown) wrote:
>
>
> On 2020/9/21 14:47, Ard Biesheuvel wrote:
>> On Mon, 21 Sep 2020 at 05:35, Leizhen (ThunderTown)
>> <thunder.leizhen@xxxxxxxxxx> wrote:
>>>
>>>
>>>
>>> On 2020/9/17 22:00, Ard Biesheuvel wrote:
>>>> On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin
>>>> <linux@xxxxxxxxxxxxxxx> wrote:
>>>>>
>>>>> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote:
>>>>>> Currently, only support the kernels where the base of physical memory is
>>>>>> at a 16MiB boundary. Because the add/sub instructions only contains 8bits
>>>>>> unrotated value. But we can use one more "add/sub" instructions to handle
>>>>>> bits 23-16. The performance will be slightly affected.
>>>>>>
>>>>>> Since most boards meet 16 MiB alignment, so add a new configuration
>>>>>> option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y if
>>>>>> anyone really needs it.
>>>>>>
>>>>>> All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are
>>>>>> used in __fixup_a_pv_table() now, but the callee saved r11 is not used in
>>>>>> the whole head.S file. So choose it.
>>>>>>
>>>>>> Because the calculation of "y = x + __pv_offset[63:24]" have been done,
>>>>>> so we only need to calculate "y = y + __pv_offset[23:16]", that's why
>>>>>> the parameters "to" and "from" of __pv_stub() and __pv_add_carry_stub()
>>>>>> in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL are all passed "t"
>>>>>> (above y).
>>>>>>
>>>>>> Signed-off-by: Zhen Lei <thunder.leizhen@xxxxxxxxxx>
>>>>>> ---
>>>>>> arch/arm/Kconfig | 18 +++++++++++++++++-
>>>>>> arch/arm/include/asm/memory.h | 16 +++++++++++++---
>>>>>> arch/arm/kernel/head.S | 25 +++++++++++++++++++------
>>>>>> 3 files changed, 49 insertions(+), 10 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>>>>>> index e00d94b16658765..19fc2c746e2ce29 100644
>>>>>> --- a/arch/arm/Kconfig
>>>>>> +++ b/arch/arm/Kconfig
>>>>>> @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT
>>>>>> kernel in system memory.
>>>>>>
>>>>>> This can only be used with non-XIP MMU kernels where the base
>>>>>> - of physical memory is at a 16MB boundary.
>>>>>> + of physical memory is at a 16MiB boundary.
>>>>>>
>>>>>> Only disable this option if you know that you do not require
>>>>>> this feature (eg, building a kernel for a single machine) and
>>>>>> you need to shrink the kernel to the minimal size.
>>>>>>
>>>>>> +config ARM_PATCH_PHYS_VIRT_RADICAL
>>>>>> + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary"
>>>>>> + default n
>>>>>
>>>>> Please drop the "default n" - this is the default anyway.
>>>>>
>>>>>> @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
>>>>>> * in place where 'r' 32 bit operand is expected.
>>>>>> */
>>>>>> __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
>>>>>> +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
>>>>>> + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16);
>>>>>
>>>>> t is already unsigned long, so this cast is not necessary.
>>>>>
>>>>> I've been debating whether it would be better to use "movw" for this
>>>>> for ARMv7. In other words:
>>>>>
>>>>> movw tmp, #16-bit
>>>>> adds %Q0, %1, tmp, lsl #16
>>>>> adc %R0, %R0, #0
>>>>>
>>>>> It would certainly be less instructions, but at the cost of an
>>>>> additional register - and we'd have to change the fixup code to
>>>>> know about movw.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>
>>>> Since LPAE implies v7, we can use movw unconditionally, which is nice.
>>>>
>>>> There is no need to use an additional temp register, as we can use the
>>>> register holding the high word. (There is no need for the mov_hi macro
>>>> to be separate)
>>>>
>>>> 0: movw %R0, #low offset >> 16
>>>> adds %Q0, %1, %R0, lsl #16
>>>> 1: mov %R0, #high offset
>>>> adc %R0, %R0, #0
>>>> .pushsection .pv_table,"a"
>>>> .long 0b, 1b
>>>> .popsection
>>>>
>>>> The only problem is distinguishing the two mov instructions from each
>>>
>>> The #high offset can also consider use movw, it just save two bytes in
>>> the thumb2 scenario. We can store different imm16 value for high_offset
>>> and low_offset, so that we can distinguish them in __fixup_a_pv_table().
>>>
>>> This will make the final implementation of the code look more clear and
>>> consistent, especially THUMB2.
>>>
>>> Let me try it.
>>>
>>
>> Hello Zhen Lei,
>>
>> I am looking into this as well:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-p2v-v2
>>
>> Could you please test this version on your hardware?
>
> OK, I will test it on my boards.
Hi Ard Biesheuvel:
I have tested it on 16MiB aligned + LE board, it works well. I've asked my colleagues
from other departments to run it on 2MiB aligned + BE board. He will do it tomorrow.
>
>>
>> .
>>
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
> .
>