Re: [PATCH] exit: add trace_task_exit() tracepoint before current->mm is reset

From: Michal Hocko
Date: Wed Apr 02 2025 - 03:21:39 EST


On Tue 01-04-25 15:04:11, Andrii Nakryiko wrote:
> On Tue, Apr 1, 2025 at 2:31 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> > On Tue, 1 Apr 2025 11:40:21 -0700
> > Andrii Nakryiko <andrii@xxxxxxxxxx> wrote:
> >
> > Hi Andrii,
> >
> > > It is useful to be able to access current->mm to, say, record a bunch of
> > > VMA information right before the task exits (e.g., for stack
> > > symbolization reasons when dealing with short-lived processes that exit
> > > in the middle of profiling session). We currently do have
> > > trace_sched_process_exit() in the exit path, but it is called a bit too
> > > late, after exit_mm() resets current->mm to NULL, which makes it
> > > unsuitable for inspecting and recording task's mm_struct-related data
> > > when tracing process lifetimes.
> >
> > My fear of adding another task exit trace event is that it will get a
> > bit confusing as that we now have trace_sched_process_exit() and also
> > trace_task_exit() with slightly different semantics.
> >
> > How about adding a trace_exit_mm()? Add that to the exit_mm() code?
>
> This is kind of the worst of both worlds, no? We still have a new
> tracepoint, but this one can't tell if it's a `group_dead` situation
> or not... I can pass group_dead into exit_mm(), but it will be just
> for the sake of that new tracepoint.

Is it important to tell the difference between a single thread exiting
and the whole thread group exiting?

Please keep in mind that even group exit doesn't really imply the mm is
going away (clone allows CLONE_VM without CLONE_SIGNAL - i.e. mm could
be shared outside of thread group).
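
To illustrate, here is a minimal userspace sketch (not part of the patch,
and the names demo_shared_mm, child_fn, shared_value and child_tgid are
made up for this example). It assumes Linux with glibc and uses
CLONE_VM | SIGCHLD, i.e. no CLONE_THREAD, so the child shares the parent's
mm but lives in its own thread group. When the child exits, its whole
thread group is dead, yet the mm stays alive because the parent still
uses it:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (256 * 1024)

/* Written by the child; the parent sees the stores only because the mm is shared. */
static volatile int shared_value;
static volatile pid_t child_tgid;

static int child_fn(void *arg)
{
	(void)arg;
	shared_value = 42;
	/* Raw syscall: returns the child's thread group id (tgid). */
	child_tgid = (pid_t)syscall(SYS_getpid);
	return 0;
}

/*
 * Returns 0 if the child shared our mm while being a separate thread
 * group, a positive value if one of those checks failed, and -1 on a
 * setup error.
 */
int demo_shared_mm(void)
{
	char *stack = malloc(STACK_SIZE);
	int status;
	pid_t pid;

	if (!stack)
		return -1;

	shared_value = 0;
	child_tgid = 0;

	/* CLONE_VM without CLONE_THREAD: same mm, different thread group. */
	pid = clone(child_fn, stack + STACK_SIZE, CLONE_VM | SIGCHLD, NULL);
	if (pid < 0) {
		free(stack);
		return -1;
	}

	/* The child is an ordinary waitable process from our point of view. */
	if (waitpid(pid, &status, 0) < 0) {
		free(stack);
		return -1;
	}
	free(stack);

	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;
	if (shared_value != 42)
		return 1;	/* store not visible: mm was not shared */
	if (child_tgid == (pid_t)syscall(SYS_getpid))
		return 2;	/* same tgid: child was in our thread group */
	return 0;
}
```

Calling demo_shared_mm() from main() should return 0 under those
assumptions: the parent keeps running on the same mm after the child's
thread group has fully exited.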
--
Michal Hocko
SUSE Labs