Re: [RFC] [Patch 0/21] Non disruptive application core dump infrastructure
From: Suzuki K. Poulose
Date: Wed Dec 15 2010 - 00:24:20 EST
On Wed, 15 Dec 2010 10:04:37 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> On Tue, 14 Dec 2010 15:22:59 +0530
> "Suzuki K. Poulose" <suzuki@xxxxxxxxxx> wrote:
>
> > Hi all,
> >
> > This is a series of patches implementing an infrastructure for capturing the core
> > of an application without disrupting its process semantics.
> >
> > The infrastructure makes use of the freezer subsystem in the kernel to freeze
> > the threads and then collect the information to generate the core.
> >
> > The interface is provided by a /proc/pid/core file; reading it gives the
> > ELF formatted core of the process with "pid". The interface supports the "seek"
> > operation on the fd, giving the dumper control over the data that is
> > being dumped. It also allows the user to store the dump at any location.
> >
> > The current implementation supports both native as well as the compat ELF
> > tasks.
> >
> > An open() call to the /proc/pid/core will try to freeze the threads in the
> > process and the read() requests will dynamically generate the contents for the
> > core file. The ELF header & Program Headers are stored in a kernel buffer to
> > allow us to map the fpos to the required data section.
> >
> > In case a thread is not frozen within a time interval, after issuing the freeze
> > request, we fill the register state information with 0's to indicate we could
> > not capture the data.
> >
> > A close() would kick the threads out of the refrigerator().
> >
> >
> > The implementation reuses some of the existing ELF core generation code by
> > exporting it. Some of the code common to both native and compat ELF class
> > support has been moved to a common place, elfcore-common.c. Also some of the
> > reusable functions, specific to the ELF class handling, have been made global,
> > after renaming the compat version of the same.
> >
> > We also added a new API, elf_core_copy_extra_phdrs(), for "reading" the arch
> > specific program headers, versus the existing elf_core_write_extra_phdrs().
> >
> > Patches 1 to 9 deal with re-arranging the ELF code to be reusable by the
> > infrastructure.
> >
> > Patches 10 to 21 implement the infrastructure.
> >
> > TODO: Add support for collecting the arch specific notes, currently used only
> > by the Cell platform.
> >
> > Please let me know your review comments / thoughts.
> >
>
> Is the purpose of this patch to debug an application without attaching gdb
> or taking a coredump with gcore?
The purpose is to take the coredump in a more reliable way without affecting
the process semantics.
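
For illustration only, a user-space dumper built on the interface described
above might look roughly like the sketch below. It assumes the semantics
stated in the RFC (open() freezes the threads, read() and lseek() pull out
core data, close() releases the threads). The helper names core_path(),
copy_range() and dump_core() are hypothetical and not part of the patch
series, and /proc/<pid>/core exists only on a kernel with these patches
applied.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: build "/proc/<pid>/core". Returns 0 on success. */
int core_path(char *dst, size_t size, long pid)
{
    int n = snprintf(dst, size, "/proc/%ld/core", pid);
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}

/*
 * Hypothetical helper: seek in_fd to offset, then copy up to len bytes to
 * out_fd. Stops early at EOF. Returns the number of bytes copied, or -1.
 */
long copy_range(int in_fd, int out_fd, off_t offset, size_t len)
{
    char buf[4096];
    size_t done = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    if (lseek(in_fd, offset, SEEK_SET) == (off_t)-1)
        return -1;

    while (done < len) {
        size_t want = (len - done < sizeof(buf)) ? len - done : sizeof(buf);
        ssize_t n = read(in_fd, buf, want);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break; /* end of input */

        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        done += (size_t)n;
    }
    return (long)done;
}

/* Hypothetical helper: dump the whole core of pid to out_path. */
int dump_core(long pid, const char *out_path)
{
    char path[64];
    if (core_path(path, sizeof(path), pid) != 0)
        return -1;

    int in_fd = open(path, O_RDONLY); /* per the RFC, this freezes the threads */
    if (in_fd < 0)
        return -1;

    int out_fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out_fd < 0) {
        close(in_fd);
        return -1;
    }

    long n = copy_range(in_fd, out_fd, 0, (size_t)-1); /* copy until EOF */
    close(out_fd);
    close(in_fd); /* per the RFC, this releases the threads */
    return n < 0 ? -1 : 0;
}
```

A caller would use dump_core(pid, "some/output/file") for a full dump, or
call copy_range() with a non-zero offset to fetch only part of the core,
which is the kind of control the seek support is meant to give.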
>
> IIUC, "freeze" is a bit dangerous because no one can end the application while
> it's frozen, and there is no indication that it's frozen via usual user
> commands such as 'ps' or 'top'.
>
> Can you add a new freeze state in which the application can at least receive
> SIGKILL, and show the task's state as "frozen" in some way, as
> task_state_array[] does in /proc/<pid>/status?
I will investigate this approach.
Thanks
Suzuki