Re: [PATCH 3/5] arm: decompressor: define a new zImage tag

From: Lukasz Stelmach
Date: Tue Jun 02 2020 - 12:17:27 EST


It was <2020-06-01 pon 21:41>, when Russell King - ARM Linux admin wrote:
> On Mon, Jun 01, 2020 at 10:27:45PM +0200, Lukasz Stelmach wrote:
>> It was <2020-06-01 pon 19:25>, when Russell King - ARM Linux admin wrote:
>>> On Mon, Jun 01, 2020 at 06:19:52PM +0200, Lukasz Stelmach wrote:
>>>> It was <2020-06-01 pon 15:55>, when Russell King - ARM Linux admin wrote:
>>>>> On Mon, Jun 01, 2020 at 04:27:52PM +0200, Łukasz Stelmach wrote:
>>>>>> Add DCSZ tag which holds dynamic memory (stack, bss, malloc pool)
>>>>>> requirements of the decompressor code.
>>>>>
>>>>> Why do we need to know the stack and BSS size, when the userspace
>>>>> kexec tool doesn't need to know this to perform the same function.
>>>>
>>>>
>>>> kexec-tools load zImage as low in DRAM as possible and rely on two
>>>> assumptions:
>>>>
>>>> + the zImage will copy itself to make enough room for the kernel,
>>>> + sizeof(zImage+mem) < sizeof(kernel+mem), which is true because
>>>> of compression.
>>>>
>>>> DRAM start
>>>> + 0x8000
>>>>
>>>> zImage |-----------|-----|-------|
>>>> text+data bss stack
>>>>
>>>> text+data bss
>>>> kernel |---------------------|-------------------|
>>>>
>>>>
>>>> initrd |-initrd-|-dtb-|
>>>
>>> This is actually incorrect, because the decompressor will self-
>>> relocate itself to avoid the area that it is going to decompress
>>> into.
>>
>> I described the state right after kexec(8) invocation.
>
> Actually, you haven't, because this is _not_ how kexec(8) lays it
> out, as I attempted to detail further down in my reply.


Let me try to describe how I understand the code in kexec-tools
(commit 74c7c369).

--8<---------------cut here---------------start------------->8---
int zImage_arm_load(..., const char *buf, off_t len, ...)
// buf - zImage
// len - size of zImage

unsigned int extra_size = 0x8000; /* TEXT_OFFSET */

kernel_mem_size = len + 4;

// locate a hole to fit zImage + 32kB as low as possible,
base = locate_hole(info, len + extra_size, 0, 0, ULONG_MAX, INT_MAX);

kernel_base = base + extra_size;

add_segment(info, buf, len, kernel_base, kernel_mem_size);
--8<---------------cut here---------------end--------------->8---

Therefore, zImage is loaded low and always requires relocation.

ram    |--------------------------------------------------------------
zImage     |----k_m_s----|
           ^
           |
           kernel_base: TEXT_OFFSET or higher
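
The placement step above can be sketched in a few lines of C. This is only
an illustration of the arithmetic, not the kexec-tools code: place_zimage(),
struct zimage_segment and EXTRA_SIZE are made-up names, and hole_base stands
in for whatever address locate_hole() returns.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define EXTRA_SIZE 0x8000u /* TEXT_OFFSET, 32 kB */

struct zimage_segment {
	uint64_t kernel_base;     /* where the zImage is loaded */
	uint64_t kernel_mem_size; /* memory reserved for the segment */
};

/* hole_base stands in for the address returned by locate_hole(). */
static struct zimage_segment place_zimage(uint64_t hole_base, uint64_t len)
{
	struct zimage_segment seg;

	assert(len > 0);
	seg.kernel_base = hole_base + EXTRA_SIZE; /* skip the first 32 kB */
	seg.kernel_mem_size = len + 4;
	return seg;
}
```

With a hole found at 0x40000000, for example, the zImage ends up at
0x40008000, which is the very region the decompressed kernel has to occupy.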

Next goes initrd

--8<---------------cut here---------------start------------->8---
kexec_arm_image_size = len * 5; // or passed on command line

// if the tag exists
kexec_arm_image_size = max(edata_size + bss_size,
edata_size + len); // len - zImage size + 64 kB

initrd_base = kernel_base + _ALIGN(kexec_arm_image_size, page_size);

add_segment(info, ramdisk_buf, initrd_size, initrd_base, initrd_size);
--8<---------------cut here---------------end--------------->8---

The initrd is placed above whichever is bigger: (kernel + kernel bss) or
(kernel + zImage + zImage mem).


ram    |---------------------------------------------------------------
zImage     |----k_m_s----|                              Where kexec loads zImage @kernel_base

           |.............len * 5....................|   Fallback
kernel     |.....edata.....|...bss...|                  These are just calculations
zImage                     |.....len+....|              zImage will copy itself here WHEN it runs

initrd                                              |--initrd_size--|
dtb                                                 ^                   |---|
                                                    |
                                                    initrd_base

DTB, of course, goes next

dtb_offset = initrd_base + initrd_size + page_size;


Stuff marked with "-" is actually loaded, "." are just calculations to
establish initrd_base.
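
The calculations above can be put together as a small C sketch. All
function names here are hypothetical and the code only mirrors the
arithmetic described in this mail. It is not taken from kexec-tools.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Fallback estimate when the zImage carries no size tag. */
static uint64_t image_size_fallback(uint64_t len)
{
	return len * 5;
}

/*
 * Estimate when edata_size and bss_size are known from the tag.
 * len is the zImage size including the extra working space.
 */
static uint64_t image_size_tagged(uint64_t edata_size, uint64_t bss_size,
				  uint64_t len)
{
	uint64_t kernel_end = edata_size + bss_size; /* kernel + kernel bss */
	uint64_t zimage_end = edata_size + len;      /* kernel + relocated zImage */

	return kernel_end > zimage_end ? kernel_end : zimage_end;
}

static uint64_t initrd_base(uint64_t kernel_base, uint64_t image_size,
			    uint64_t page_size)
{
	return kernel_base + align_up(image_size, page_size);
}

static uint64_t dtb_offset(uint64_t initrd_base_addr, uint64_t initrd_size,
			   uint64_t page_size)
{
	return initrd_base_addr + initrd_size + page_size;
}
```

Only the results of initrd_base() and dtb_offset() correspond to segments
that are actually loaded. The two image_size_*() values are the dotted
spans in the diagram.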

>>> So, while the decompressor runs, in the above situation it
>>> ends up as:
>>>
>>>
>>> ram |------------------------------------------------------...
>>> text+data bss
>>> kernel |---------------------|-------------------|
>>> zImage |-----------|-----|-------|
>>> text+data bss stack+malloc
>
> Note here - if the initrd was placed as _you_ describe at the end
> of where the zImage ends up including its BSS, it would be
> corrupted by the stack and malloc space of the decompressor while
> running. Ergo, your description of how kexec(8) lays stuff out
> is incorrect.

Is my analysis above accurate now? Do I understand this?

As you noted, my intention is to load zImage after edata (dotted len+
above).

>>>>>> +struct arm_zimage_tag_dc {
>>>>>> + struct tag_header hdr;
>>>>>> + union {
>>>>>> +#define ZIMAGE_TAG_DECOMP_SIZE ARM_ZIMAGE_MAGIC4
>>>>>> + struct zimage_decomp_size {
>>>>>> + __le32 bss_size;
>>>>>> + __le32 stack_size;
>>>>>> + __le32 malloc_size;
>>>>>> + } decomp_size;
>>>
>>> You certainly don't need to know all this. All you need to know is
>>> how much space the decompressor requires after the end of the image,
>>> encompassing the BSS size, stack size and malloc size, which is one
>>> value.
>>
>> I agree. However, since we are not fighting here for every single byte,
>> I'd rather add them as separate values and make the tag, even if only
>> slightly, more future-proof.
>
> It doesn't make it more future-proof. What happens if we add something
> else, do we need to list it separately, and then change the kernel to
> accept the new value and maybe also kexec(8)? Or do we just say "the
> decompressor needs X many bytes after the image" and be done with it?
> The latter sounds way more future-proof to me.

You are right. I changed it to a single value. Done.
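
To illustrate what the single value means for a loader, here is a rough C
sketch. The helper names are invented for this example and do not show the
actual tag layout from the patch. The point is only that a loader needs
the sum and not the individual terms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: fold the three requirements into the one value. */
static uint32_t decomp_extra_size(uint32_t bss_size, uint32_t stack_size,
				  uint32_t malloc_size)
{
	uint64_t total = (uint64_t)bss_size + stack_size + malloc_size;

	assert(total <= UINT32_MAX); /* must still fit a 32-bit tag field */
	return (uint32_t)total;
}

/*
 * Lowest address a loader may use for other data when the zImage
 * occupies [base, base + len) and needs extra bytes after its end.
 */
static uint64_t first_free_after_zimage(uint64_t base, uint64_t len,
					uint32_t extra)
{
	return base + len + extra;
}
```

If the decompressor later grows another memory area, only the value
returned by decomp_extra_size() changes and the loader side stays as it is.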

--
Łukasz Stelmach
Samsung R&D Institute Poland
Samsung Electronics
