Re: [PATCH 2/2] ARM: decompressor: relax the loading restriction of the decompressed kernel

From: Leizhen (ThunderTown)
Date: Mon Sep 28 2020 - 22:54:06 EST

On 2020/9/28 20:57, Geert Uytterhoeven wrote:
> Hi Zhen,
>
> On Mon, Sep 28, 2020 at 2:15 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>> On Mon, 28 Sep 2020 at 13:57, Leizhen (ThunderTown)
>> <thunder.leizhen@xxxxxxxxxx> wrote:
>>> On 2020/9/28 18:14, Ard Biesheuvel wrote:
>>>> On Mon, 28 Sep 2020 at 11:27, Zhen Lei <thunder.leizhen@xxxxxxxxxx> wrote:
>>>>>
>>>>> mov r4, pc
>>>>> and r4, r4, #0xf8000000 //truncated to 128MiB boundary
>>>>> add r4, r4, #TEXT_OFFSET //PA(_start)
>>>>>
>>>>> Currently, the decompressed kernel must be placed at a 128MiB boundary
>>>>> + TEXT_OFFSET. This limitation exists only because we mask PC with
>>>>> 0xf8000000. In fact, we can obtain PA(_start) directly with the formula
>>>>> VA(_start) + (PHYS_OFFSET - PAGE_OFFSET).
>>>>>
>>>>> So "PA(_start) - TEXT_OFFSET" can be on a 2MiB boundary, a 1MiB boundary,
>>>>> and so on.
>>>>>
>>>>> Signed-off-by: Zhen Lei <thunder.leizhen@xxxxxxxxxx>
>>>>
>>>> No, this won't work.
>>>
>>> But it works well on my board.
>>>
>>
>> That is because you load zImage at the base of DRAM.
>>
>>>>
>>>> The whole reason for rounding to a multiple of 128 MB is that we
>>>> cannot infer the start of DRAM from the placement of the zImage (which
>>>> provides _start).
>>>
>>> Maybe this is further guaranteed by the following code:
>>> /*
>>> * Set up a page table only if it won't overwrite ourself.
>>> * That means r4 < pc || r4 - 16k page directory > &_end.
>>> * Given that r4 > &_end is most unfrequent, we add a rough
>>> * additional 1MB of room for a possible appended DTB.
>>> */
>>>
>>> In addition, the Image has already been loaded when this position is reached.
>>>
>>> ------------  <--128MiB boundary
>>> |          |
>>> ------------  <--TEXT_OFFSET  <--
>>> | (1)Image |                    |
>>> ------------                    |
>>> |          |                    |
>>> ------------  (2)--copyto-------
>>> | (2)Image |
>>> ------------
>>>
>>> I think this is case (1), not case (2), because case (2) would require
>>> no code modification.
>>>
>>> By the way, I'm not familiar with the arm32 code, so I'm just speculating.
>>>
>>
>> The zImage code that runs has not received *any* information from the
>> platform on where DRAM starts, so the only info it has is the current
>> placement of zImage.
>>
>> So when zImage is loaded at the intended base of DRAM, things work fine.
>>
>> If the zImage is loaded close to the end of a 128 MB region, the
>> rounding would pick the start of that 128 MB region. However, the
>> _start symbol you are using will point to an address that is close to
>> the end of the 128 MB [given that it is inside zImage] so your logic
>> will pick an address that is much higher in memory.
>
> https://people.kernel.org/linusw/how-the-arm32-linux-kernel-decompresses
> https://people.kernel.org/linusw/how-the-arm32-kernel-starts
> are good reads.

Thanks for the information.
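
To check my understanding of Ard's point, here is a minimal sketch (plain
Python, not kernel code) comparing the two ways of choosing the destination
address. TEXT_OFFSET = 0x8000, the 2MiB mask, and the example load addresses
are assumptions for illustration only. The second function is just a rough
model of "derive the address from where zImage sits", not the actual patch.

```python
MASK_128M = 0xF8000000   # same mask as "and r4, r4, #0xf8000000"
MASK_2M = 0xFFE00000     # assumed finer alignment, for illustration only
TEXT_OFFSET = 0x8000     # assumed value; it is platform dependent


def dest_by_128m_rounding(pc):
    """Current behaviour: round PC down to a 128MiB boundary, add TEXT_OFFSET."""
    return (pc & MASK_128M) + TEXT_OFFSET


def dest_by_finer_rounding(pc):
    """Hypothetical model: trust the zImage placement up to a 2MiB boundary."""
    return (pc & MASK_2M) + TEXT_OFFSET


# Case A: zImage loaded at the base of DRAM (assumed to be 0x80000000).
pc_at_base = 0x80008000
# Case B: zImage loaded close to the end of the same 128MiB region.
pc_near_end = 0x87000000

for name, pc in (("base", pc_at_base), ("near end", pc_near_end)):
    print(name,
          hex(dest_by_128m_rounding(pc)),
          hex(dest_by_finer_rounding(pc)))
```

With zImage at the base of DRAM both functions give the same address, which
would be consistent with it working on my board. With zImage near the end of
the 128MiB region, the rounding still picks the start of that region, while the
finer model picks an address much higher in memory.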

>
> Gr{oetje,eeting}s,
>
> Geert
>
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds