Re: [PATCH v3 11/19] unwind: Add deferred user space unwinding API

From: Andrii Nakryiko
Date: Thu Oct 31 2024 - 19:28:41 EST


On Thu, Oct 31, 2024 at 4:13 PM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
>
> On Thu, Oct 31, 2024 at 02:22:48PM -0700, Andrii Nakryiko wrote:
> > > Problem is, the unwinder doesn't know in advance which tasks will be
> > > unwound.
> > >
> > > Its first clue is unwind_user_register(), would it make sense for the
> > > caller to clarify whether all tasks need to be unwound or only a
> > > specific subset?
> > >
> > > Its second clue is unwind_user_deferred(), which is called for the task
> > > itself. But by then it's too late because it needs to access the
> > > per-task data from (potentially) irq context so it can't do a lazy
> > > allocation.
> > >
> > > I'm definitely open to ideas...
> >
> > The laziest thing would be to perform GFP_ATOMIC allocation, and if
> > that fails, oops, too bad, no stack trace for you (but, generally
> > speaking, no big deal). Advantages are clear, though, right? Single
> > pointer in task_struct, which most of the time will be NULL, so no
> > unnecessary overheads.
>
> GFP_ATOMIC is limited, I don't think we want the unwinder to trigger
> OOM.
>

So all task_structs on the system using 104 bytes more, *permanently*
and *unconditionally*, is not a concern, but lazy GFP_ATOMIC
allocation when you actually need it is?

> > It's the last point that's important to make usability so much
> > simpler, avoiding unnecessary custom timeouts and stuff like that.
> > Regardless whether stack trace capture is success or not, user is
> > guaranteed to get a "notification" about the outcome.
> >
> > Hope this helps.
> >
> > But basically, if I I called unwind_user_deferred(), I expect to get
> > some callback, guaranteed, with the result or failure. The only thing
> > that's not guaranteed (and which makes timeouts bad) is *when* this
> > will happen. Because stack trace capture can be arbitrarily delayed
> > and stuff. That's fine, but that also shows why timeout is tricky and
> > necessarily fragile.
>
> That sounds reasonable. In the OOM error case I can just pass a small
> (stack allocated) one-entry trace with only regs->ip.
>

SGTM

> --
> Josh
>