Re: [PATCH v9 06/11] arm64: kexec_file: allow for loading Image-format kernel

From: AKASHI Takahiro
Date: Mon May 21 2018 - 04:37:36 EST


James,

I haven't commented on this email.

On Tue, May 15, 2018 at 06:14:37PM +0100, James Morse wrote:
> Hi Akashi,
>
> On 15/05/18 06:13, AKASHI Takahiro wrote:
> > On Fri, May 11, 2018 at 06:07:06PM +0100, James Morse wrote:
> >> On 07/05/18 08:21, AKASHI Takahiro wrote:
> >>> On Tue, May 01, 2018 at 06:46:11PM +0100, James Morse wrote:
> >>>> On 25/04/18 07:26, AKASHI Takahiro wrote:
> >>>>> This patch provides kexec_file_ops for "Image"-format kernel. In this
> >>>>> implementation, a binary is always loaded with a fixed offset identified
> >>>>> in text_offset field of its header.
> >>
> >>>>> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> >>>>> index e4de1223715f..3cba4161818a 100644
> >>>>> --- a/arch/arm64/include/asm/kexec.h
> >>>>> +++ b/arch/arm64/include/asm/kexec.h
>
> >>>> Could we check branch_code is non-zero, and text-offset points within image-size?
> >>>
> >>> We could do it, but I don't think this check is very useful.
> >>>
> >>>>
> >>>> We could check that this platform supports the page-size/endian config that this
> >>>> Image was built with... We get a message from the EFI stub if the page-size
> >>>> can't be supported, it would be nice to do the same here (as we can).
> >>>
> >>> There is no restriction on page-size or endianness for kexec.
> >>
> >> No, but it won't boot if the hardware doesn't support it. The kernel will spin
> >> at a magic address that is, difficult, to debug without JTAG. The bug report
> >> will be "it didn't boot".
> >
> > OK.
> > Added sanity checks for cpu features, endianness as well as page size.
> >
> >>
> >>> What will be the purpose of this check?
> >>
> >> These values are in the header so that the bootloader can check them, then print
> >> a meaningful error. Here, kexec_file_load() is playing the part of the bootloader.
>
> >> I'm assuming kexec_file_load() can only be used to kexec linux... unlike regular
> >> kexec. Is this where I'm going wrong?
>
> Trying to work this out for myself: we can't support any UEFI application as we
> can't give it the boot-services environment, so I'm pretty sure
> kexec_file_load() must be linux-specific.
>
> Can we state somewhere that we only expect arm64 linux to be booted with
> kexec_file_load()? Its not clear from the kconfig text, which refers to kexec,
> which explicitly states it can boot other OS. But for kexec_file_load() we're
> following the kernel's booting.txt.

While I don't know anything about requirements in booting other OS's nor
if we can boot them even with kexec, I agree that kexec_file_load is a more
limited form of booting mechanism. I will add some statement in Kconfig.

> >>>>> diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
> >>>>> new file mode 100644
> >>>>> index 000000000000..4dd524ad6611
> >>>>> --- /dev/null
> >>>>> +++ b/arch/arm64/kernel/kexec_image.c
> >>>>> @@ -0,0 +1,79 @@
> >>>>
> >>>>> +static void *image_load(struct kimage *image,
> >>>>> + char *kernel, unsigned long kernel_len,
> >>>>> + char *initrd, unsigned long initrd_len,
> >>>>> + char *cmdline, unsigned long cmdline_len)
> >>>>> +{
> >>>>> + struct kexec_buf kbuf;
> >>>>> + struct arm64_image_header *h = (struct arm64_image_header *)kernel;
> >>>>> + unsigned long text_offset;
> >>>>> + int ret;
> >>>>> +
> >>>>> + /* Load the kernel */
> >>>>> + kbuf.image = image;
> >>>>> + kbuf.buf_min = 0;
> >>>>> + kbuf.buf_max = ULONG_MAX;
> >>>>> + kbuf.top_down = false;
> >>>>> +
> >>>>> + kbuf.buffer = kernel;
> >>>>> + kbuf.bufsz = kernel_len;
> >>>>> + kbuf.memsz = le64_to_cpu(h->image_size);
> >>>>> + text_offset = le64_to_cpu(h->text_offset);
> >>>>> + kbuf.buf_align = SZ_2M;
> >>>>
> >>>>> + /* Adjust kernel segment with TEXT_OFFSET */
> >>>>> + kbuf.memsz += text_offset;
> >>>>> +
> >>>>> + ret = kexec_add_buffer(&kbuf);
> >>>>> + if (ret)
> >>>>> + goto out;
> >>>>> +
> >>>>> + image->arch.kern_segment = image->nr_segments - 1;
> >>>>
> >>>> You only seem to use kern_segment here, and in load_other_segments() called
> >>>> below. Could it not be a local variable passed in? Instead of arch-specific data
> >>>> we keep forever?
> >>>
> >>> No, kern_segment is also used in load_other_segments() in machine_kexec_file.c.
> >>> To optimize memory hole allocation logic in locate_mem_hole_callback(),
> >>> we need to know the exact range of kernel image (start and end).
> >>
> >> That's the second user. My badly-made point is one calls the other, but passes
> >> the data via some until-kexec lifetime struct. (its not important, just an
> >> indicator this worked differently in the past and hasn't been cleaned up).
> >> I meant something like [0].
> >
> > OK, but instead of adding kern_seg, I want to change the interface to:
> >
> > | extern int load_other_segments(struct kimage *image,
> > | unsigned long kernel_load_addr, unsigned long kernel_size,
> > | char *initrd, unsigned long initrd_len,
> > | char *cmdline, unsigned long cmdline_len);
> >
> > This way, we will in future be able to address an issue I mentioned in
> > my previous e-mail. (If we support vmlinux, the kernel occupies two segments
> > for text and data, respectively.)
>
> Aha, its not from old-stuff, its for future-stuff!

I have vmlinux patch, but it is very unlikely for me to submit it :)

Thanks,
-Takahiro AKASHI

>
> James