Re: [RFC PATCH] tracing: add sched_prio_update
From: Peter Zijlstra
Date: Thu Aug 11 2016 - 06:43:23 EST
On Mon, Jul 04, 2016 at 03:46:04PM -0400, Julien Desfossez wrote:
> The effective priority of running threads can also be temporarily
> changed in the PI code, but a dedicated tracepoint is already in place
> to cover this case.
So while we have that tracepoint, it's not really all that useful.
I would suggest removing it and replacing it with a combination of this
tracepoint and a new blocked-on tracepoint that would let us trace the
entire blocking graph (we could even include !PI primitives).
> Here are a few output examples:
> After fork of a normal task:
> sched_prio_update: comm=bash pid=2104, policy=SCHED_NORMAL, nice=0,
> rt_priority=0, dl_runtime=0, dl_deadline=0, dl_period=0
>
> renice -n 10 of a normal task:
> sched_prio_update: comm=sleep pid=2130, policy=SCHED_NORMAL, nice=10,
> rt_priority=0, dl_runtime=0, dl_deadline=0, dl_period=0
>
> SCHED_FIFO 60:
> sched_prio_update: comm=chrt pid=2105, policy=SCHED_FIFO, nice=0,
> rt_priority=60, dl_runtime=0, dl_deadline=0, dl_period=0
>
> SCHED_RR 60:
> sched_prio_update: comm=chrt pid=2109, policy=SCHED_RR, nice=0,
> rt_priority=60, dl_runtime=0, dl_deadline=0, dl_period=0
>
> SCHED_DEADLINE:
> sched_prio_update: comm=b pid=2110, policy=SCHED_DEADLINE, nice=0,
> rt_priority=0, dl_runtime=10000000, dl_deadline=30000000,
> dl_period=30000000
Looks OK.
> +++ b/kernel/fork.c
> @@ -1773,6 +1773,7 @@ long _do_fork(unsigned long clone_flags,
> struct pid *pid;
>
> trace_sched_process_fork(current, p);
> + trace_sched_prio_update(p);
>
> pid = get_task_pid(p, PIDTYPE_PID);
> nr = pid_vnr(pid);
I'm a bit torn on this, like Steven I loathe back to back tracepoints.
Also, per a minimalist argument, we don't need this tracepoint, since a
child will inherit the parent's attributes, except in the
SCHED_FLAG_RESET_ON_FORK case.
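To make the inheritance rule concrete, here is a rough userspace model of what a trace consumer could assume about a child's attributes without a fork-time event. It is a sketch, not kernel code: struct prio_state and child_prio_state() are hypothetical names, and the reset branch reflects my reading of the reset-on-fork handling in sched_fork() (RT and deadline tasks drop to SCHED_NORMAL at nice 0, normal tasks only lose a negative nice value), which should be checked against the tree you are actually tracing.

```c
#include <assert.h>
#include <stdbool.h>

/* Policy numbers as in the Linux uapi headers. */
enum { POLICY_NORMAL = 0, POLICY_FIFO = 1, POLICY_RR = 2, POLICY_DEADLINE = 6 };

/* Hypothetical userspace mirror of the fields the tracepoint prints. */
struct prio_state {
	int policy;
	int nice;
	unsigned int rt_priority;
	unsigned long long dl_runtime, dl_deadline, dl_period;
	bool reset_on_fork;	/* models SCHED_FLAG_RESET_ON_FORK */
};

/*
 * Derive the child's state from the parent's at fork time.
 * Assumption: this mirrors the reset-on-fork behaviour as I understand it;
 * it is a model for a trace consumer, not the kernel implementation.
 */
struct prio_state child_prio_state(const struct prio_state *parent)
{
	struct prio_state child;

	assert(parent);
	child = *parent;	/* default: plain inheritance */

	if (!parent->reset_on_fork)
		return child;

	if (parent->policy == POLICY_FIFO || parent->policy == POLICY_RR ||
	    parent->policy == POLICY_DEADLINE) {
		/* RT and deadline tasks fall back to a default normal task. */
		child.policy = POLICY_NORMAL;
		child.nice = 0;
		child.rt_priority = 0;
		child.dl_runtime = 0;
		child.dl_deadline = 0;
		child.dl_period = 0;
	} else if (parent->nice < 0) {
		/* Normal tasks only lose a boosted (negative) nice value. */
		child.nice = 0;
	}

	/* The flag itself is not passed on to the child. */
	child.reset_on_fork = false;
	return child;
}
```

With a model like this, the fork-time tracepoint only carries new information in the reset case; everywhere else the consumer could copy the parent's last known state, provided it has one.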
At the same time, this delta approach to state has the problem that at
the start of tracing we know nothing, which makes it hard to interpret
traces.
Adding the tracepoint here cures it for new tasks in the trace, but
doesn't help anything for pre-existing tasks.
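The gap is easy to see from the consumer side. Below is a minimal sketch of a userspace tool that keeps the last sched_prio_update seen per pid; everything in it (struct prio_event, the prio_table_* helpers, the fixed table size) is made up for illustration and is not part of the patch. A lookup for a task that existed before tracing started, and has not changed priority since, simply returns nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED 64	/* arbitrary bound for the sketch */

/* Hypothetical decoded form of one sched_prio_update event. */
struct prio_event {
	int pid;
	int policy;
	int nice;
	unsigned int rt_priority;
};

/* Last known state per pid, built purely from the events seen so far. */
struct prio_table {
	struct prio_event last[MAX_TRACKED];
	int used;
};

void prio_table_init(struct prio_table *t)
{
	assert(t);
	t->used = 0;
}

/* Apply one delta event; returns false if the table is full. */
bool prio_table_update(struct prio_table *t, const struct prio_event *ev)
{
	for (int i = 0; i < t->used; i++) {
		if (t->last[i].pid == ev->pid) {
			t->last[i] = *ev;
			return true;
		}
	}
	if (t->used == MAX_TRACKED)
		return false;
	t->last[t->used++] = *ev;
	return true;
}

/*
 * Returns the last known state, or NULL when no event has been seen
 * for this pid, which is the case for every pre-existing task.
 */
const struct prio_event *prio_table_lookup(const struct prio_table *t, int pid)
{
	for (int i = 0; i < t->used; i++) {
		if (t->last[i].pid == pid)
			return &t->last[i];
	}
	return NULL;
}
```

Any fix for pre-existing tasks amounts to some way of seeding this table at trace start, whatever form that ends up taking.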
I do feel this is something we need to cure, because I often trace bits
during the 'running' phase of a program and would not get anything.
Suggestions?
So while I have no objections, I would really rather like to see a more
complete approach before moving on this.