Re: [RFC PATCH 03/11] sched: Add sched tracepoints for RV task model

From: Gabriele Monaco
Date: Thu Feb 06 2025 - 03:37:00 EST




On Thu, 2025-02-06 at 09:19 +0100, Peter Zijlstra wrote:
> On Thu, Feb 06, 2025 at 09:09:39AM +0100, Gabriele Monaco wrote:
>
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 9632e3318e0d6..af9fa18035c71 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -226,12 +226,14 @@ struct user_event_mm;
> >  #define
> > __set_current_state(state_value) \
> >   do
> > { \
> >   debug_normal_state_change((state_value));
> > \
> > + trace_set_current_state(state_value);
> > \
> >   WRITE_ONCE(current->__state,
> > (state_value)); \
> >   } while (0)
> >  
> >  #define
> > set_current_state(state_value) \
> >   do
> > { \
> >   debug_normal_state_change((state_value));
> > \
> > + trace_set_current_state(state_value);
> > \
> >   smp_store_mb(current->__state,
> > (state_value)); \
> >   } while (0)
> >  
> > @@ -247,6 +249,7 @@ struct user_event_mm;
> >  
> > \
> >   raw_spin_lock_irqsave(&current->pi_lock,
> > flags); \
> >   debug_special_state_change((state_value));
> > \
> > + trace_set_current_state(state_value);
> > \
> >   WRITE_ONCE(current->__state,
> > (state_value)); \
> >   raw_spin_unlock_irqrestore(&current->pi_lock,
> > flags); \
> >   } while (0)
> > @@ -282,6 +285,7 @@ struct user_event_mm;
> >   raw_spin_lock(&current-
> > >pi_lock); \
> >   current->saved_state = current-
> > >__state; \
> >   debug_rtlock_wait_set_state();
> > \
> > + trace_set_current_state(TASK_RTLOCK_WAIT);
> > \
> >   WRITE_ONCE(current->__state,
> > TASK_RTLOCK_WAIT); \
> >   raw_spin_unlock(&current-
> > >pi_lock); \
> >   } while (0);
> > @@ -291,6 +295,7 @@ struct user_event_mm;
> >   lockdep_assert_irqs_disabled();
> > \
> >   raw_spin_lock(&current-
> > >pi_lock); \
> >   debug_rtlock_wait_restore_state();
> > \
> > + trace_set_current_state(TASK_RUNNING);
> > \
> >   WRITE_ONCE(current->__state, current-
> > >saved_state); \
> >   current->saved_state =
> > TASK_RUNNING; \
> >   raw_spin_unlock(&current-
> > >pi_lock); \
>
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 165c90ba64ea9..fb5f8aa61ef5d 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -491,6 +491,12 @@ sched_core_dequeue(struct rq *rq, struct
> > task_struct *p, int flags) { }
> >  
> >  #endif /* CONFIG_SCHED_CORE */
> >  
> > +void trace_set_current_state(int state_value)
> > +{
> > + trace_sched_set_state_tp(current, current->__state,
> > state_value);
> > +}
> > +EXPORT_SYMBOL(trace_set_current_state);
>
> Urgh, why !?!
>

Do you mean why exporting it?
At first I was puzzled too (this line is borrowed from Daniel), but the
thing is: set_current_state and friends are macros called by any sort
of code (e.g. modules), this seems the easiest way without messing up
with the current code.

I'm not sure if it would be cleaner to just drop this function and
define it directly in the header (including also trace/events/sched.h
there). It felt like files including sched shouldn't know about
tracepoints.

What do you think would be better?