Re: RE:[PATCH] sched: Add trace for task wake up latency and leave running time
From: peterz
Date: Thu Sep 03 2020 - 03:43:01 EST
On Wed, Sep 02, 2020 at 10:35:34PM +0000, gengdongjiu wrote:
> > NAK, that tracepoint is already broken, we don't want to proliferate the broken.
>
> Sorry, what do you mean by "that tracepoint is already broken"?
Just that, the tracepoint is crap. But we can't fix it because ABI. Did
I tell you I utterly hate tracepoints?
> Maybe I need to explain why I added the two tracepoints.
> When using the perf tool or the ftrace sysfs interface to capture the task wake-up latency and the task leaving-run-queue time, the trace data is usually too large and CPU utilization is too high because of the heavy disk writes. Sometimes the disk even fills up before the issue is reproduced, that is, before either of those two times exceeds a certain threshold. So I added two tracepoints; with a filter, we can record only the abnormal traces, those where the wake-up latency and the leaving-running time are larger than a threshold.
> Or do you have a better solution?
Learn to use a MUA and wrap your lines at 78 chars like normal people.
Yes, use ftrace synthetic events, or bpf or really anything other than
this.
> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index 8471a0f7eb32..b5a1928dc948 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -2464,6 +2464,8 @@ static void ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags,
> > >  {
> > >  	check_preempt_curr(rq, p, wake_flags);
> > >  	p->state = TASK_RUNNING;
> > > +	p->ts_wakeup = local_clock();
> > > +	p->wakeup_state = true;
> > >  	trace_sched_wakeup(p);
> > >
> > >  #ifdef CONFIG_SMP
> >
> > NAK, useless overhead.
>
> At sched switch, we do not know the next task's previous state or its
> wakeup timestamp, so I record the task's previous state if it was
> woken from sleep. The wakeup latency can then be calculated at task
> switch.
I don't care. You're making things slower.
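
For illustration only, here is a user-space C model of the bookkeeping discussed in this thread: remember a timestamp when a wakeup is seen, compute the delta when the task is switched in, and report it only above a threshold. The point of the sketch is that this state can live in a table owned by the observer (roughly what a BPF map or an ftrace histogram keyed by pid would hold) rather than in new task_struct fields on the wakeup path. Every name below (wake_rec, on_wakeup, on_switch_in, MAX_TASKS) is hypothetical, and none of it is kernel or BPF API.

```c
/*
 * User-space model of threshold-filtered wakeup latency tracking.
 * All identifiers are hypothetical; this is not kernel or BPF code.
 */
#include <stddef.h>
#include <stdint.h>

#define MAX_TASKS 64            /* size of the toy pid -> timestamp table */

struct wake_rec {
	int      used;          /* slot holds a pending wakeup */
	int      pid;
	uint64_t ts_ns;         /* time at which the wakeup was observed */
};

static struct wake_rec table[MAX_TASKS];

/* Find the slot tracking pid; if absent and create is set, claim a free one. */
static struct wake_rec *lookup(int pid, int create)
{
	struct wake_rec *free_slot = NULL;

	for (int i = 0; i < MAX_TASKS; i++) {
		if (table[i].used && table[i].pid == pid)
			return &table[i];
		if (!table[i].used && !free_slot)
			free_slot = &table[i];
	}
	if (create && free_slot) {
		free_slot->used = 1;
		free_slot->pid = pid;
		return free_slot;
	}
	return NULL;
}

/* Call where a wakeup of pid is observed. Silently drops if the table is full. */
void on_wakeup(int pid, uint64_t now_ns)
{
	struct wake_rec *r = lookup(pid, 1);

	if (r)
		r->ts_ns = now_ns;
}

/*
 * Call where pid is observed being switched in.
 * Returns the wakeup latency in ns if it exceeds threshold_ns, else 0.
 * A task with no pending wakeup (for example, one that was preempted
 * rather than asleep) also yields 0.
 */
uint64_t on_switch_in(int pid, uint64_t now_ns, uint64_t threshold_ns)
{
	struct wake_rec *r = lookup(pid, 0);
	uint64_t lat;

	if (!r)
		return 0;
	lat = now_ns - r->ts_ns;
	r->used = 0;            /* consume the pending wakeup */
	return lat > threshold_ns ? lat : 0;
}
```

A caller would feed on_wakeup() and on_switch_in() from whatever event source it already has and write out only the non-zero results, which keeps the volume of recorded data down without touching the scheduler fast path.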