Re: [PATCH 2/2] ARM: decompressor: relax the loading restriction of the decompressed kernel

From: Geert Uytterhoeven
Date: Mon Sep 28 2020 - 08:58:13 EST


Hi Zhen,

On Mon, Sep 28, 2020 at 2:15 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> On Mon, 28 Sep 2020 at 13:57, Leizhen (ThunderTown)
> <thunder.leizhen@xxxxxxxxxx> wrote:
> > On 2020/9/28 18:14, Ard Biesheuvel wrote:
> > > On Mon, 28 Sep 2020 at 11:27, Zhen Lei <thunder.leizhen@xxxxxxxxxx> wrote:
> > >>
> > >> mov r4, pc
> > >> and r4, r4, #0xf8000000 //truncated to 128MiB boundary
> > >> add r4, r4, #TEXT_OFFSET //PA(_start)
> > >>
> > >> Currently, the decompressed kernel must be placed at the position: 128MiB
> > >> boundary + TEXT_OFFSET. This limitation is just because we masked PC with
> > >> 0xf8000000. Actually, we can directly obtain PA(_start) by using formula
> > >> : VA(_start) + (PHYS_OFFSET - PAGE_OFFSET).
> > >>
> > >> So the "PA(_start) - TEXT_OFFSET" can be 2MiB boundary, 1MiB boundary,
> > >> and so on.
> > >>
> > >> Signed-off-by: Zhen Lei <thunder.leizhen@xxxxxxxxxx>
> > >
> > > No, this won't work.
> >
> > But it works well on my board.
> >
>
> That is because you load zImage at the base of DRAM.
>
> > >
> > > The whole reason for rounding to a multiple of 128 MB is that we
> > > cannot infer the start of DRAM from the placement of the zImage (which
> > > provides _start).
> >
> > Maybe this is further guaranteed by the following code:
> > /*
> > * Set up a page table only if it won't overwrite ourself.
> > * That means r4 < pc || r4 - 16k page directory > &_end.
> > * Given that r4 > &_end is most unfrequent, we add a rough
> > * additional 1MB of room for a possible appended DTB.
> > */
> >
> > In addition, the Image has already been loaded when this position is reached.
> >
> >   -----------  <--128MiB boundary
> >   |          |
> >   -----------  <--TEXT_OFFSET  <--
> >   | (1)Image |                   |
> >   -----------                    |
> >   |          |                   |
> >   -----------  (2)--copyto-------
> >   | (2)Image |
> >   -----------
> >
> > I don't think it's the case of (2), but (1). Because no code modification is
> > required for the case (2).
> >
> > By the way, I'm not familiar with the arm32 code, so I'm just speculating.
> >
>
> The zImage code that runs has not received *any* information from the
> platform on where DRAM starts, so the only info it has is the current
> placement of zImage.
>
> So when zImage is loaded at the intended base of DRAM, things work fine.
>
> If the zImage is loaded close to the end of a 128 MB region, the
> rounding would pick the start of that 128 MB region. However, the
> _start symbol you are using will point to an address that is close to
> the end of the 128 MB [given that it is inside zImage] so your logic
> will pick an address that is much higher in memory.

https://people.kernel.org/linusw/how-the-arm32-linux-kernel-decompresses
https://people.kernel.org/linusw/how-the-arm32-kernel-starts
are good reads.

Gr{oetje,eeting}s,

Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds