Re: [PATCH v9 2/7] kexec: define functions to map and unmap segments
From: Baoquan He
Date: Wed Mar 05 2025 - 07:25:24 EST
On 03/04/25 at 04:55pm, steven chen wrote:
> On 3/4/2025 2:23 PM, Jarkko Sakkinen wrote:
> > On Tue, Mar 04, 2025 at 11:03:46AM -0800, steven chen wrote:
> > > The content of memory segments carried over to the new kernel during the
> > > kexec systemcall can be changed at kexec 'execute' stage, but the size of
> > > the memory segments cannot be changed at kexec 'execute' stage.
> > >
> > > To copy IMA measurement logs during the kexec operation, IMA needs to
> > > allocate memory at the kexec 'load' stage and map the segments to the
> > > kimage structure. The mapped address will then be used to copy IMA
> > > measurements during the kexec 'execute' stage.
> > >
> > > Currently, the mechanism to map and unmap segments to the kimage
> > > structure is not available to subsystems outside of kexec.
> > How does IMA work with kexec without having this? Just interested
> > (and confused).
> Currently, all IMA-related operations during a soft reboot, such as memory
> allocation and IMA log list copy, are handled in the kexec 'load' stage, so
> the map/unmap mechanism is not required.
>
> The new design separates these two operations into different stages: memory
> allocation remains in the kexec 'load' stage, while the IMA log list copy is
> moved to the kexec 'execute' stage. Therefore, the map/unmap mechanism is
> introduced.
I think the log can be improved. The problem and solution part could
possibly be described like below:
===
Currently, the kernel behaviour at kexec load is that the IMA measurements
log is fetched from TPM PCRs, stored into a buffer and held there. When
kexec reboot is triggered, the stored log buffer is carried over to the
2nd kernel. However, the time gap between kexec load and kexec reboot
could be very long. Any new events extended into the TPM PCRs during
that time window then miss the chance to be carried over to the 2nd
kernel. This results in a mismatch between the TPM PCR quotes and the
actual IMA measurements list after kexec reboot, which in turn results
in remote attestation failure.

To solve this problem, the new design defers reading the TPM PCRs
content out into the kexec buffer until the kexec reboot phase, while
still allocating the necessary buffer at kexec load time, because it's
not appropriate to allocate memory at kexec reboot time.
===
It may still need to be improved; it's just for your reference. You can
change it, add any further detail needed, and put it into your log.
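
To make the split between the two stages concrete, here is a rough
userspace model, not kernel code. struct carry_buf, model_kexec_load()
and model_kexec_execute() are made-up names for illustration only, and
the 'extra' headroom parameter is my assumption about how the buffer
would be sized for events that arrive after load.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the segment that carries the log over. */
struct carry_buf {
	char *data;
	size_t size;	/* fixed at 'load' time, cannot grow later */
	size_t used;	/* filled in at 'execute' time */
};

/*
 * 'load' stage: only reserve the space. 'extra' is assumed headroom
 * for events measured between load and execute.
 */
int model_kexec_load(struct carry_buf *buf, size_t cur_len, size_t extra)
{
	buf->data = calloc(1, cur_len + extra);
	if (!buf->data)
		return -1;
	buf->size = cur_len + extra;
	buf->used = 0;
	return 0;
}

/*
 * 'execute' stage: copy the log as it is now. No allocation happens
 * here, so a log that outgrew the reserved buffer is rejected.
 */
int model_kexec_execute(struct carry_buf *buf, const char *log, size_t len)
{
	assert(buf->data);
	if (len > buf->size)
		return -1;
	memcpy(buf->data, log, len);
	buf->used = len;
	return 0;
}
```

The point of the model is only that the size is decided once at load,
while the content is taken at execute, so events logged in between are
still carried over as long as they fit.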
>
> Please refer to "[PATCH v9 0/7] ima: kexec: measure events between kexec
> load and execute" for the reason why to add this.
>
> Steven
>
> > > Implement kimage_map_segment() to enable IMA to map measurement log list to
> > > the kimage structure during kexec 'load' stage. This function takes a kimage
> > > pointer, a memory address, and a size, then gathers the
> > > source pages within the specified address range, creates an array of page
> > > pointers, and maps these to a contiguous virtual address range. The
> > > function returns the start virtual address of this range if successful, or NULL on
> > > failure.
> > >
> > > Implement kimage_unmap_segment() for unmapping segments
> > > using vunmap().
> > >
> > > From: Tushar Sugandhi <tusharsu@xxxxxxxxxxxxxxxxxxx>
> > > Signed-off-by: Tushar Sugandhi <tusharsu@xxxxxxxxxxxxxxxxxxx>
> > > Cc: Eric Biederman <ebiederm@xxxxxxxxxxxx>
> > > Cc: Baoquan He <bhe@xxxxxxxxxx>
> > > Cc: Vivek Goyal <vgoyal@xxxxxxxxxx>
> > > Cc: Dave Young <dyoung@xxxxxxxxxx>
> > > Signed-off-by: steven chen <chenste@xxxxxxxxxxxxxxxxxxx>
> > > ---
> > > include/linux/kexec.h | 6 +++++
> > > kernel/kexec_core.c | 54 +++++++++++++++++++++++++++++++++++++++++++
> > > 2 files changed, 60 insertions(+)
> > >
> > > diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> > > index f0e9f8eda7a3..7d6b12f8b8d0 100644
> > > --- a/include/linux/kexec.h
> > > +++ b/include/linux/kexec.h
> > > @@ -467,13 +467,19 @@ extern bool kexec_file_dbg_print;
> > > #define kexec_dprintk(fmt, arg...) \
> > > do { if (kexec_file_dbg_print) pr_info(fmt, ##arg); } while (0)
> > > +extern void *kimage_map_segment(struct kimage *image, unsigned long addr, unsigned long size);
> > > +extern void kimage_unmap_segment(void *buffer);
> > > #else /* !CONFIG_KEXEC_CORE */
> > > struct pt_regs;
> > > struct task_struct;
> > > +struct kimage;
> > > static inline void __crash_kexec(struct pt_regs *regs) { }
> > > static inline void crash_kexec(struct pt_regs *regs) { }
> > > static inline int kexec_should_crash(struct task_struct *p) { return 0; }
> > > static inline int kexec_crash_loaded(void) { return 0; }
> > > +static inline void *kimage_map_segment(struct kimage *image, unsigned long addr, unsigned long size)
> > > +{ return NULL; }
> > > +static inline void kimage_unmap_segment(void *buffer) { }
> > > #define kexec_in_progress false
> > > #endif /* CONFIG_KEXEC_CORE */
> > > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > > index c0bdc1686154..63e4d16b6023 100644
> > > --- a/kernel/kexec_core.c
> > > +++ b/kernel/kexec_core.c
> > > @@ -867,6 +867,60 @@ int kimage_load_segment(struct kimage *image,
> > > return result;
> > > }
> > > +void *kimage_map_segment(struct kimage *image,
> > > + unsigned long addr, unsigned long size)
> > > +{
> > > + unsigned long eaddr = addr + size;
> > > + unsigned long src_page_addr, dest_page_addr;
> > > + unsigned int npages;
> > > + struct page **src_pages;
> > > + int i;
> > > + kimage_entry_t *ptr, entry;
> > > + void *vaddr = NULL;
When adding a new function, it's usually suggested to order the local
variables in reverse xmas tree style.
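
For example, the declarations could be ordered longest line first, as
in the sketch below. It is a userspace model so that it compiles on its
own, not a replacement for the patch: the macros are redefined locally
assuming 4 KiB pages, the entry list is a flat array without
IND_INDIRECTION pages, collect_src_pages() is a made-up name, and it
returns the source page addresses instead of calling vmap(). It also
initializes dest_page_addr, which is my own addition.

```c
#include <assert.h>
#include <stdlib.h>

/* Local stand-ins for the kernel definitions (4 KiB pages assumed). */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PAGE_MASK	(~(PAGE_SIZE - 1))
#define PFN_UP(x)	(((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
#define PFN_DOWN(x)	((x) >> PAGE_SHIFT)

#define IND_DESTINATION	0x1UL
#define IND_DONE	0x4UL
#define IND_SOURCE	0x8UL

typedef unsigned long kimage_entry_t;

/* Hypothetical helper: gather source pages backing [addr, addr + size). */
unsigned long *collect_src_pages(kimage_entry_t *head, unsigned long addr,
				 unsigned long size, unsigned int *count)
{
	unsigned long src_page_addr, dest_page_addr = 0;
	unsigned long eaddr = addr + size;
	kimage_entry_t *ptr, entry;
	unsigned long *src_pages;
	unsigned int npages;
	unsigned int i = 0;

	assert(head && count);

	npages = PFN_UP(eaddr) - PFN_DOWN(addr);
	src_pages = malloc(npages * sizeof(*src_pages));
	if (!src_pages)
		return NULL;

	/* Flat walk: a destination entry sets the cursor, sources follow. */
	for (ptr = head; (entry = *ptr) && !(entry & IND_DONE); ptr++) {
		if (entry & IND_DESTINATION) {
			dest_page_addr = entry & PAGE_MASK;
		} else if (entry & IND_SOURCE) {
			if (dest_page_addr >= addr && dest_page_addr < eaddr) {
				src_page_addr = entry & PAGE_MASK;
				src_pages[i++] = src_page_addr;
				if (i == npages)
					break;
				dest_page_addr += PAGE_SIZE;
			}
		}
	}

	*count = i;	/* caller compares this with the expected page count */
	return src_pages;
}
```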
> > > +
> > > + /*
> > > + * Collect the source pages and map them in a contiguous VA range.
> > > + */
> > > + npages = PFN_UP(eaddr) - PFN_DOWN(addr);
> > > + src_pages = kmalloc_array(npages, sizeof(*src_pages), GFP_KERNEL);
> > > + if (!src_pages) {
> > > + pr_err("Could not allocate ima pages array.\n");
> > > + return NULL;
> > > + }
> > > +
> > > + i = 0;
> > > + for_each_kimage_entry(image, ptr, entry) {
> > > + if (entry & IND_DESTINATION) {
> > > + dest_page_addr = entry & PAGE_MASK;
> > > + } else if (entry & IND_SOURCE) {
> > > + if (dest_page_addr >= addr && dest_page_addr < eaddr) {
> > > + src_page_addr = entry & PAGE_MASK;
> > > + src_pages[i++] =
> > > + virt_to_page(__va(src_page_addr));
> > > + if (i == npages)
> > > + break;
> > > + dest_page_addr += PAGE_SIZE;
> > > + }
> > > + }
> > > + }
> > > +
> > > + /* Sanity check. */
> > > + WARN_ON(i < npages);
> > > +
> > > + vaddr = vmap(src_pages, npages, VM_MAP, PAGE_KERNEL);
> > > + kfree(src_pages);
> > > +
> > > + if (!vaddr)
> > > + pr_err("Could not map ima buffer.\n");
> > > +
> > > + return vaddr;
> > > +}
> > > +
> > > +void kimage_unmap_segment(void *segment_buffer)
> > > +{
> > > + vunmap(segment_buffer);
> > > +}
> > > +
> > > struct kexec_load_limit {
> > > /* Mutex protects the limit count. */
> > > struct mutex mutex;
> > > --
> > > 2.25.1
> > >
> > >
> > BR, Jarkko
>
>