Re: [PATCH -mm 1/2] kexec jump -v12: kexec jump

From: Huang Ying
Date: Tue Jul 08 2008 - 21:04:47 EST


On Tue, 2008-07-08 at 10:50 -0400, Vivek Goyal wrote:
> On Mon, Jul 07, 2008 at 11:25:22AM +0800, Huang Ying wrote:
> > This patch provides an enhancement to kexec/kdump. It implements
> > the following features:
> >
> > - Backup/restore memory used by the original kernel before/after
> > kexec.
> >
> > - Save/restore CPU state before/after kexec.
> >
>
> Hi Huang,
>
> In general this patch set looks good enough to live in -mm and
> get some testing going.
>
> To me, adding capability to return back to original kernel looks
> like a logical extension to kexec functionality.
>
> Acked-by: Vivek Goyal <vgoyal@xxxxxxxxxx>
>
> Few minor comments inline.

Thank you very much!

> [..]
> > --- a/arch/x86/kernel/machine_kexec_32.c
> > +++ b/arch/x86/kernel/machine_kexec_32.c
> > @@ -22,6 +22,7 @@
> > #include <asm/cpufeature.h>
> > #include <asm/desc.h>
> > #include <asm/system.h>
> > +#include <asm/cacheflush.h>
> >
> > #define PAGE_ALIGNED __attribute__ ((__aligned__(PAGE_SIZE)))
> > static u32 kexec_pgd[1024] PAGE_ALIGNED;
> > @@ -85,10 +86,12 @@ static void load_segments(void)
> > * reboot code buffer to allow us to avoid allocations
> > * later.
> > *
> > - * Currently nothing.
> > + * Make control page executable.
> > */
> > int machine_kexec_prepare(struct kimage *image)
> > {
> > + if (nx_enabled)
> > + set_pages_x(image->control_code_page, 1);
> > return 0;
> > }
> >
> > @@ -98,16 +101,24 @@ int machine_kexec_prepare(struct kimage
> > */
> > void machine_kexec_cleanup(struct kimage *image)
> > {
> > + if (nx_enabled)
> > + set_pages_nx(image->control_code_page, 1);
> > }
> >
> > /*
> > * Do not allocate memory (or fail in any way) in machine_kexec().
> > * We are past the point of no return, committed to rebooting now.
> > */
> > -NORET_TYPE void machine_kexec(struct kimage *image)
> > +void machine_kexec(struct kimage *image)
> > {
> > unsigned long page_list[PAGES_NR];
> > void *control_page;
> > + asmlinkage unsigned long
> > + (*relocate_kernel_ptr)(unsigned long indirection_page,
> > + unsigned long control_page,
> > + unsigned long start_address,
> > + unsigned int has_pae,
> > + unsigned int preserve_context);
> >
> > tracer_disable();
> >
> > @@ -115,10 +126,11 @@ NORET_TYPE void machine_kexec(struct kim
> > local_irq_disable();
> >
> > control_page = page_address(image->control_code_page);
> > - memcpy(control_page, relocate_kernel, PAGE_SIZE);
> > + memcpy(control_page, relocate_kernel, PAGE_SIZE/2);
> >
>
> Is it possible to add either a compile time or run time check
> somewhere to make sure code in relocate_kernel.S does not exceed
> PAGE_SIZE/2.

OK, I will add it.

> [..]
> > --- a/kernel/kexec.c
> > +++ b/kernel/kexec.c
> > @@ -24,6 +24,8 @@
> > #include <linux/utsrelease.h>
> > #include <linux/utsname.h>
> > #include <linux/numa.h>
> > +#include <linux/suspend.h>
> > +#include <linux/device.h>
> >
> > #include <asm/page.h>
> > #include <asm/uaccess.h>
> > @@ -242,6 +244,12 @@ static int kimage_normal_alloc(struct ki
> > goto out;
> > }
> >
> > + image->swap_page = kimage_alloc_control_pages(image, 0);
> > + if (!image->swap_page) {
> > + printk(KERN_ERR "Could not allocate swap buffer\n");
> > + goto out;
> > + }
> > +
> > result = 0;
> > out:
> > if (result == 0)
> > @@ -986,6 +994,8 @@ asmlinkage long sys_kexec_load(unsigned
> > if (result)
> > goto out;
> >
> > + if (flags & KEXEC_PRESERVE_CONTEXT)
> > + image->preserve_context = 1;
> > result = machine_kexec_prepare(image);
> > if (result)
> > goto out;
> > @@ -1411,3 +1421,50 @@ static int __init crash_save_vmcoreinfo_
> > }
> >
> > module_init(crash_save_vmcoreinfo_init)
> > +
> > +/**
> > + * kernel_kexec - reboot the system
> > + *
> > + * Move into place and start executing a preloaded standalone
> > + * executable. If nothing was preloaded return an error.
> > + */
> > +int kernel_kexec(void)
> > +{
> > + int error = 0;
> > +
> > + if (xchg(&kexec_lock, 1))
> > + return -EBUSY;
> > + if (!kexec_image) {
> > + error = -EINVAL;
> > + goto Unlock;
> > + }
> > +
> > + if (kexec_image->preserve_context) {
> > +#ifdef CONFIG_KEXEC_JUMP
> > + local_irq_disable();
> > + save_processor_state();
> > +#endif
> > + } else {
> > + blocking_notifier_call_chain(&reboot_notifier_list,
> > + SYS_RESTART, NULL);
> > + system_state = SYSTEM_RESTART;
> > + device_shutdown();
> > + sysdev_shutdown();
> > + printk(KERN_EMERG "Starting new kernel\n");
> > + machine_shutdown();
>
> All the above code was part of kernel_restart_prepare(), can't we just
> make that function non-static and use that?

OK, I will do that.

> [..]
> > --- a/arch/x86/kernel/relocate_kernel_32.S
> > +++ b/arch/x86/kernel/relocate_kernel_32.S
> > @@ -20,11 +20,44 @@
> > #define PAGE_ATTR (_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY)
> > #define PAE_PGD_ATTR (_PAGE_PRESENT)
> >
> > +/* control_page + PAGE_SIZE/2 ~ control_page + PAGE_SIZE * 3/4 are
> > + * used to save some data for jumping back
> > + */
> > +#define DATA(offset) (PAGE_SIZE/2+(offset))
> > +
> > +/* Minimal CPU state */
> > +#define ESP DATA(0x0)
> > +#define CR0 DATA(0x4)
> > +#define CR3 DATA(0x8)
> > +#define CR4 DATA(0xc)
> > +
> > +/* other data */
> > +#define CP_VA_CONTROL_PAGE DATA(0x10)
> > +#define CP_PA_PGD DATA(0x14)
> > +#define CP_PA_SWAP_PAGE DATA(0x18)
> > +#define CP_PA_BACKUP_PAGES_MAP DATA(0x1c)
> > +
>
> In general, this assembly piece of code is getting bigger and its
> difficult to read it now. I think we should at-least pull out the page
> table setup code into C. Somebody had posted a patch to do that. Don't
> know what happened to that. Anyway, this is a separate issue and is on
> wish list.

In fact, that patch was posted by me. I will re-post that patch.

Best Regards,
Huang Ying


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/