Re: [PATCH v2 2/2] ARM: support PHYS_OFFSET minimum aligned at 64KiB boundary

From: Ard Biesheuvel
Date: Thu Sep 17 2020 - 10:10:43 EST


On Tue, 15 Sep 2020 at 22:06, Russell King - ARM Linux admin
<linux@xxxxxxxxxxxxxxx> wrote:
>
> On Tue, Sep 15, 2020 at 09:16:15PM +0800, Zhen Lei wrote:
> > Currently, only kernels where the base of physical memory is at a 16MiB
> > boundary are supported, because the add/sub instructions only contain an
> > 8-bit unrotated value. But we can use one more add/sub instruction to
> > handle bits 23-16. The performance will be slightly affected.
> >
> > Since most boards meet the 16MiB alignment, add a new configuration
> > option ARM_PATCH_PHYS_VIRT_RADICAL (default n) to control it. Say Y
> > only if you really need it.
> >
> > All r0-r7 (r1 = machine no, r2 = atags or dtb, in the start-up phase) are
> > used in __fixup_a_pv_table() now, but the callee saved r11 is not used in
> > the whole head.S file. So choose it.
> >
> > Because the calculation of "y = x + __pv_offset[63:24]" has already
> > been done, we only need to calculate "y = y + __pv_offset[23:16]". That
> > is why the parameters "to" and "from" of __pv_stub() and
> > __pv_add_carry_stub() in the scope of CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
> > are both passed "t" (y above).
> >
> > Signed-off-by: Zhen Lei <thunder.leizhen@xxxxxxxxxx>
> > ---
> > arch/arm/Kconfig | 18 +++++++++++++++++-
> > arch/arm/include/asm/memory.h | 16 +++++++++++++---
> > arch/arm/kernel/head.S | 25 +++++++++++++++++++------
> > 3 files changed, 49 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index e00d94b16658765..19fc2c746e2ce29 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -240,12 +240,28 @@ config ARM_PATCH_PHYS_VIRT
> > kernel in system memory.
> >
> > This can only be used with non-XIP MMU kernels where the base
> > - of physical memory is at a 16MB boundary.
> > + of physical memory is at a 16MiB boundary.
> >
> > Only disable this option if you know that you do not require
> > this feature (eg, building a kernel for a single machine) and
> > you need to shrink the kernel to the minimal size.
> >
> > +config ARM_PATCH_PHYS_VIRT_RADICAL
> > + bool "Support PHYS_OFFSET minimum aligned at 64KiB boundary"
> > + default n
>
> Please drop the "default n" - this is the default anyway.
>
> > @@ -236,6 +243,9 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
> > * in place where 'r' 32 bit operand is expected.
> > */
> > __pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
> > +#ifdef CONFIG_ARM_PATCH_PHYS_VIRT_RADICAL
> > + __pv_stub((unsigned long) t, t, "sub", __PV_BITS_23_16);
>
> t is already unsigned long, so this cast is not necessary.
>
> I've been debating whether it would be better to use "movw" for this
> for ARMv7. In other words:
>
> movw tmp, #16-bit
> adds %Q0, %1, tmp, lsl #16
> adc %R0, %R0, #0
>
> It would certainly be fewer instructions, but at the cost of an
> additional register - and we'd have to change the fixup code to
> know about movw.
>
> Thoughts?
>

Since LPAE implies v7, we can use movw unconditionally, which is nice.

There is no need to use an additional temp register, as we can use the
register holding the high word. (There is no need for the mov_hi macro
to be separate)

0:      movw    %R0, #low offset >> 16
        adds    %Q0, %1, %R0, lsl #16
1:      mov     %R0, #high offset
        adc     %R0, %R0, #0
        .pushsection .pv_table,"a"
        .long   0b, 1b
        .popsection
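
To make the arithmetic of that sequence concrete, here is a rough C model
of it. This is only a sketch: the function name is made up for
illustration, it assumes the offset is 64KiB aligned (low 16 bits zero),
and it models the patched immediates as plain values rather than as
encoded instructions.

```c
#include <stdint.h>

/*
 * Hypothetical model of the four-instruction sequence; not kernel code.
 * Assumes pv_offset has its low 16 bits clear (64KiB alignment).
 */
static uint64_t virt_to_phys_sketch(uint32_t virt, uint64_t pv_offset)
{
	/* movw %R0, #low offset >> 16: the high register doubles as a temp */
	uint32_t hi = (uint32_t)(pv_offset >> 16) & 0xffffu;

	/* adds %Q0, %1, %R0, lsl #16: add bits 31-16, remember the carry */
	uint32_t lo = virt + (hi << 16);
	uint32_t carry = lo < virt;

	/* mov %R0, #high offset: the temp is overwritten with the high word */
	hi = (uint32_t)(pv_offset >> 32);

	/* adc %R0, %R0, #0: fold the carry into the high word */
	hi += carry;

	return ((uint64_t)hi << 32) | lo;
}
```

For any 64KiB-aligned offset this should give the same result as a plain
64-bit "virt + pv_offset", including the case where the offset is negative
and the high word is all ones.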

The only problem is distinguishing the two mov instructions from each
other, but that should not be too hard I think.
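
As a rough illustration of how the two could be told apart, the sketch
below checks the opcode bits of the instruction word. It assumes ARM (A32)
encodings only; Thumb-2 kernels would need different masks. The helper
names are hypothetical, and the real fixup code lives in assembly in
head.S, so this is just the idea expressed in C.

```c
#include <stdbool.h>
#include <stdint.h>

/* A32 MOVW: cond 0011 0000 imm4 Rd imm12 */
static bool is_a32_movw(uint32_t insn)
{
	return (insn & 0x0ff00000u) == 0x03000000u;
}

/* A32 MOV/MOVS (immediate): cond 0011 101S 0000 Rd imm12 */
static bool is_a32_mov_imm(uint32_t insn)
{
	return (insn & 0x0fe00000u) == 0x03a00000u;
}

/*
 * Hypothetical helper: place a 16-bit value into the imm4:imm12 fields
 * of a MOVW instruction word.
 */
static uint32_t patch_movw_imm16(uint32_t insn, uint16_t imm16)
{
	insn &= ~0x000f0fffu;				/* clear imm4 and imm12 */
	insn |= ((uint32_t)(imm16 >> 12) << 16);	/* imm4  -> bits 19:16 */
	insn |= imm16 & 0x0fffu;			/* imm12 -> bits 11:0  */
	return insn;
}
```

Bits 23-20 are zero for movw and non-zero for mov (and for mvn), so a
single test on those bits is enough to pick the right patching path.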