Re: [RFC] [Patch 0/21] Non disruptive application core dump infrastructure
From: KAMEZAWA Hiroyuki
Date: Tue Dec 14 2010 - 20:10:33 EST
On Tue, 14 Dec 2010 15:22:59 +0530
"Suzuki K. Poulose" <suzuki@xxxxxxxxxx> wrote:
> Hi all,
>
> This is a series of patches implementing an infrastructure for capturing the core
> of an application without disrupting its process semantics.
>
> The infrastructure makes use of the freezer subsystem in kernel to freeze the
> threads and then collect the information to generate the core.
>
> The interface is provided by a /proc/pid/core file, reading which can give the
> ELF formatted core of the process with "pid". The interface supports "seek"
> operation on the fd, allowing the dumper to have control on the data that is
> being dumped. Also it allows the user to store the dump at any location.
>
> The current implementation supports both native as well as the compat ELF
> tasks.
>
> An open() call to the /proc/pid/core will try to freeze the threads in the
> process and the read() requests will dynamically generate the contents for the
> core file. The ELF header & Program Headers are stored in a kernel buffer to
> allow us to map the fpos to the required data section.
>
> In case a thread is not frozen within a time interval, after issuing the freeze
> request, we fill the register state information with 0's to indicate we could
> not capture the data.
>
> A close() would kick the threads out of the refrigerator().
>
>
> The implementation reuses some of the existing ELF core generation code by
> exporting them. Some of the code common to both native and compat ELF class
> support has been moved to a common place, elfcore-common.c. Also some of the
> reusable functions, specific to the ELF class handling, have been made global,
> after renaming the compat version of the same.
>
> We also added a new API, elf_core_copy_extra_phdrs(), for "reading" the arch
> specific program headers, versus the existing elf_core_write_extra_phdrs().
>
> Patches 1 to 9 deal with re-arranging the ELF code to be reusable by the
> infrastructure.
>
> Patches 10 to 21 implement the infrastructure.
>
> TODO: Add support for collecting the arch specific notes, currently used only
> by Cell platform.
>
> Please let me know your review comments / thoughts.
>
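As a rough illustration of the interface described in the quoted cover letter,
a user-space dumper could be as small as the sketch below. It assumes only
what the cover letter states: open() on /proc/<pid>/core freezes the threads,
read() returns the ELF core contents, and close() thaws them. The helper names
core_path() and copy_file() are hypothetical, and /proc/<pid>/core exists only
with this series applied.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "/proc/<pid>/core", the file proposed by this series. */
static int core_path(pid_t pid, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/proc/%ld/core", (long)pid);

	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/*
 * Copy everything readable from src to dst.
 * Returns the number of bytes copied, or -1 on error.
 * With src = /proc/<pid>/core, the open() is where the freeze would
 * happen and the close() is where the threads would be released.
 */
static long copy_file(const char *src, const char *dst)
{
	char buf[4096];
	long total = 0;
	ssize_t n;
	int in, out;

	in = open(src, O_RDONLY);
	if (in < 0)
		return -1;
	out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (out < 0) {
		close(in);
		return -1;
	}

	while ((n = read(in, buf, sizeof(buf))) > 0) {
		ssize_t off = 0;

		/* write() may be partial, so loop until the chunk is out. */
		while (off < n) {
			ssize_t w = write(out, buf + off, n - off);

			if (w < 0) {
				total = -1;
				goto done;
			}
			off += w;
		}
		total += n;
	}
	if (n < 0)
		total = -1;
done:
	close(out);
	close(in);
	return total;
}
```

A caller would build the path with core_path(pid, path, sizeof(path)) and then
call copy_file(path, "/some/where/core.<pid>"). Since the interface supports
seek, a smarter dumper could lseek() past regions it does not want instead of
copying sequentially.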
Is the purpose of this patch to debug an application without attaching gdb
or taking a core dump with gcore?
IIUC, "freeze" is a bit dangerous because no one can end the application while
it is frozen, and nothing tells the user "it's frozen" via the usual user
commands such as 'ps' or 'top'.
Can you at least add a new freeze state in which the application can still
receive SIGKILL? And can you show the task's state as "frozen" in some way,
the way task_state_array[] shows it in /proc/<pid>/status?
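
For reference, this is roughly how a user-space tool picks up that state
today: it reads the "State:" line of /proc/<pid>/status, whose text comes from
task_state_array[]. A minimal sketch follows. The helper name
read_task_state() is hypothetical, and a "frozen" value is only what is being
asked for here, not something the kernel reports now.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the value of the "State:" line from a status file into out,
 * e.g. "S (sleeping)". Returns 0 on success, -1 if the file cannot be
 * opened or has no "State:" line.
 */
static int read_task_state(const char *status_path, char *out, size_t len)
{
	char line[256];
	int ret = -1;
	FILE *f;

	if (len == 0)
		return -1;
	f = fopen(status_path, "r");
	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		const char *p;

		if (strncmp(line, "State:", 6) != 0)
			continue;
		/* Skip the separator after the field name. */
		p = line + 6;
		while (*p == ' ' || *p == '\t')
			p++;
		snprintf(out, len, "%s", p);
		out[strcspn(out, "\n")] = '\0';	/* drop trailing newline */
		ret = 0;
		break;
	}
	fclose(f);
	return ret;
}
```

Tools like 'ps' and 'top' derive their state column from the same per-task
state, so a distinct entry in task_state_array[] would be visible to them
as well as to anything parsing this file.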
Thanks,
-Kame