Re: [PATCH v3 11/19] unwind: Add deferred user space unwinding API
From: Josh Poimboeuf
Date: Wed Oct 30 2024 - 13:48:03 EST
On Wed, Oct 30, 2024 at 09:44:14AM -0400, Mathieu Desnoyers wrote:
> What you want here is to move the point where you clear the task
> cookie to _after_ completion of stack unwind. There are a few ways
> this can be done:
>
> A) Clear the task cookie in the task_work() right after the
> unwind_user_deferred() is completed. Downside: if some long task work
> happens to be done after the stack walk, a new unwind_user_deferred()
> could be issued again and we may end up looping forever taking stack
> unwind and never actually making forward progress.
>
> B) Clear the task cookie after the exit_to_user_mode_loop is done,
> before returning to user-space, while interrupts are disabled.

Problem is, if another tracer calls unwind_user_deferred() for the first
time, after the task work but before the task cookie gets cleared, it
will see the cookie is non-zero and will fail to schedule another task
work. So its callback never gets called.

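To make that sequence concrete, here is a rough user-space model of it.
Everything in it (struct task_model, model_request() and friends) is made
up for illustration and is not the code from the patch. It only assumes
that a request schedules task work when the cookie is zero, and that the
cookie is cleared late, on the way back to user space:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-task state; not the real kernel struct. */
struct task_model {
	uint64_t cookie;	/* non-zero while a request is considered active */
	bool work_pending;	/* stands in for a queued task work */
	int waiting;		/* tracers still waiting for their callback */
	int served;		/* tracers whose callback has run */
};

static uint64_t next_cookie = 1;

/* Models unwind_user_deferred(): only schedules work if cookie is zero. */
static uint64_t model_request(struct task_model *t)
{
	t->waiting++;
	if (!t->cookie) {
		t->cookie = next_cookie++;
		t->work_pending = true;
	}
	return t->cookie;
}

/* Models the task work: unwind once and call every waiting tracer. */
static void model_run_task_work(struct task_model *t)
{
	if (!t->work_pending)
		return;
	t->work_pending = false;
	t->served += t->waiting;
	t->waiting = 0;
}

/* Models option B: clear the cookie just before returning to user space. */
static void model_exit_to_user(struct task_model *t)
{
	t->cookie = 0;
}

/* Returns the number of tracers left without a callback. */
static int model_race(void)
{
	struct task_model t = {0};

	model_request(&t);		/* tracer 1: schedules task work */
	model_run_task_work(&t);	/* unwind done, tracer 1 served */
	model_request(&t);		/* tracer 2: cookie still set, no work */
	model_run_task_work(&t);	/* nothing pending, so this is a no-op */
	model_exit_to_user(&t);		/* cookie cleared too late for tracer 2 */

	return t.waiting;
}
```

In this model model_race() leaves one tracer waiting, which is the second
one: it saw a non-zero cookie and so never got a task work of its own.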
> > If I change the entry code to increment a per-task counter instead of a
> > per-cpu counter then this problem goes away. I was just concerned about
> > the performance impacts of doing that on every entry.
>
> Moving from per-cpu to per-task makes this cookie task-specific and not
> global anymore, I don't think we want this for a stack walking
> infrastructure meant to be used by various tracers. Also a global cookie
> is more robust and does not depend on guaranteeing that all the
> trace data is present to guarantee current thread ID accuracy and
> thus that cookies match between deferred unwind request and their
> fulfillment.

I don't disagree. What I meant was, on entry (or exit), increment the
task cookie *with* the CPU bits included.

Or as you suggested previously, the "cookie" could just be a struct with
two fields: CPU # and per-task entry counter.

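
Something like the following is what I have in mind, as a sketch only.
The names (struct cookie_sketch, cookie_on_entry(), cookie_pack()) and the
32/32 bit split are assumptions for illustration, not anything from the
patch set:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-field cookie: CPU number plus per-task entry counter. */
struct cookie_sketch {
	uint32_t cpu;		/* CPU the task entered the kernel on */
	uint32_t entry_count;	/* per-task count of kernel entries */
};

/* Bump the per-task counter on entry and build the cookie for this entry. */
static struct cookie_sketch cookie_on_entry(uint32_t cpu,
					    uint32_t *task_entry_count)
{
	struct cookie_sketch c = {
		.cpu = cpu,
		.entry_count = ++(*task_entry_count),
	};

	return c;
}

/* A deferred request matches its fulfillment only if both fields match. */
static bool cookie_equal(struct cookie_sketch a, struct cookie_sketch b)
{
	return a.cpu == b.cpu && a.entry_count == b.entry_count;
}

/* The same information as a single value with the CPU bits included. */
static uint64_t cookie_pack(struct cookie_sketch c)
{
	return ((uint64_t)c.cpu << 32) | c.entry_count;
}
```

Two entries by the same task on the same CPU then get different cookies,
since the per-task counter has moved on in between.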
--
Josh