Re: [RFC PATCH v6 1/5] perf sched: sync state char array with the kernel

From: Ze Gao
Date: Thu Aug 03 2023 - 22:23:25 EST


On Thu, Aug 3, 2023 at 11:10 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Thu, 3 Aug 2023 04:33:48 -0400
> Ze Gao <zegao2021@xxxxxxxxx> wrote:
>
> > Update state char array and then remove unused and stale
> > macros, which are kernel internal representations and not
> > encouraged to use anymore.
> >
> > Signed-off-by: Ze Gao <zegao@xxxxxxxxxxx>
> > ---
> > tools/perf/builtin-sched.c | 13 +------------
> > 1 file changed, 1 insertion(+), 12 deletions(-)
> >
> > diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> > index 9ab300b6f131..8dc8f071721c 100644
> > --- a/tools/perf/builtin-sched.c
> > +++ b/tools/perf/builtin-sched.c
> > @@ -92,23 +92,12 @@ struct sched_atom {
> > struct task_desc *wakee;
> > };
> >
> > -#define TASK_STATE_TO_CHAR_STR "RSDTtZXxKWP"
> > +#define TASK_STATE_TO_CHAR_STR "RSDTtXZPI"
>
> Thinking about this more, this will always be wrong. Changing it just works
> for the kernel you made the change for, but if it is run on another kernel,
> it's broken again.

Indeed. There is no easy way to maintain backward compatibility unless
we stop using this bizarre 'prev_state' field. Basically all of its users
suffer from this. That's why I believe we need a fix that warns people
not to use 'prev_state' anymore.

> I actually wrote code once that basically just did a:
>
> struct trace_seq s;
>
> trace_seq_init(&s);
> tep_print_event(tep, &s, record, "%s", TEP_PRINT_INFO);
>
> then searched s.buffer for "prev_state=%s ", to find the state character.
>
> That's because the kernel should always be up to date (and why I said I
> needed that string in the print_fmt).

Switching to building the state char array dynamically from the print fmt
string is a great idea. :)
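
A rough sketch of what I have in mind, assuming the print fmt lists the
state flags as { mask, "char" } pairs in ascending bit order and that the
zero state is reported as 'R'. build_state_chars() is a hypothetical name,
not an existing perf or libtraceevent function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: collect the state characters from the
 * { mask, "c" } pairs found in a print fmt string. Slot 0 is assumed
 * to be 'R', since the zero state has no flag entry. Characters are
 * appended in order of appearance, so the pairs are assumed to be
 * listed in ascending bit order. Returns the number of characters
 * written, not counting the terminating NUL.
 */
static size_t build_state_chars(const char *fmt, char *out, size_t len)
{
	const char *p = fmt;
	size_t n = 0;

	assert(fmt != NULL && out != NULL && len >= 2);

	out[n++] = 'R';

	while (n + 1 < len && (p = strchr(p, '{')) != NULL) {
		const char *q;
		char *end;

		/* The mask value itself is not needed, only its extent. */
		(void)strtoul(p + 1, &end, 0);
		if (end == p + 1) {
			/* No number after '{', so this is not a flag pair. */
			p++;
			continue;
		}

		q = end;
		while (*q == ' ' || *q == ',')
			q++;

		/* Accept only a single character in double quotes. */
		if (q[0] == '"' && q[1] != '\0' && q[2] == '"')
			out[n++] = q[1];

		p = q;
	}

	out[n] = '\0';
	return n;
}
```

Fed the flag list from a kernel matching this patch, this should yield
the same "RSDTtXZPI" string, but without hardcoding it in perf.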

> As perf has a tep handle, this could be a helper function to extract the
> state if needed, and get rid of relying on the above character array.

I'll figure out how to make it happen.
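
Something along these lines, as a minimal sketch: it assumes the event has
already been formatted into a text buffer (e.g. s.buffer after
tep_print_event() as in your snippet) and only does the string search.
prev_state_char() is a hypothetical name, and only the first character
after the key is returned:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return the first state character that follows
 * "prev_state=" in the formatted event text, or '\0' if the field is
 * not present.
 */
static char prev_state_char(const char *info)
{
	static const char key[] = "prev_state=";
	const char *p;

	assert(info != NULL);

	p = strstr(info, key);
	if (!p)
		return '\0';

	/* This is '\0' as well if the text ends right after the key. */
	return p[sizeof(key) - 1];
}
```

This way the character always comes from whatever the running kernel
prints, so perf would not need its own copy of the array.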

BTW, my last concern is whether there is a better way to tell userspace
not to interpret the task state from 'prev_state'. Otherwise this awkward
situation will happen again.

Thanks,
Ze

> -- Steve
>
>
> >
> > /* task state bitmask, copied from include/linux/sched.h */
> > #define TASK_RUNNING 0
> > #define TASK_INTERRUPTIBLE 1
> > #define TASK_UNINTERRUPTIBLE 2
> > -#define __TASK_STOPPED 4
> > -#define __TASK_TRACED 8
> > -/* in tsk->exit_state */
> > -#define EXIT_DEAD 16
> > -#define EXIT_ZOMBIE 32
> > -#define EXIT_TRACE (EXIT_ZOMBIE | EXIT_DEAD)
> > -/* in tsk->state again */
> > -#define TASK_DEAD 64
> > -#define TASK_WAKEKILL 128
> > -#define TASK_WAKING 256
> > -#define TASK_PARKED 512
> >
> > enum thread_state {
> > THREAD_SLEEPING = 0,
>