Re: [PATCH v2] sched: introduce sched_switch_post trace event

From: Peter Zijlstra
Date: Tue Jul 07 2015 - 02:46:15 EST


On Mon, Jul 06, 2015 at 12:15:45PM -0700, Cong Wang wrote:
> Currently we only have one sched_switch trace event
> for task switching, which is generated very early during
> task switch. When we try to monitor per-container perf
> events, this is not what we expect.
>
> For example, we have a process A which is in the cgroup
> we monitor, and process B which isn't. When the kernel switches
> from B to A, the sched_switch event is not recorded for this
> cgroup since it belongs to B (current process is still B
> until we finish the switch), but we require this event to
> signal that process A in this cgroup gets scheduled. This is
> crucial for calculating schedule latency (like `perf sched`).
>
> Ideally, we need to split the sched_switch event into two:
> sched_in event before we perform the switch, and sched_out
> event after we perform the switch. However, for compatibility,
> we cannot change the sched_switch event. So until we have
> trace event aliases, we can just reuse sched_switch and introduce
> sched_switch_post event instead.

No, it's still horrible.

You're trying to solve perf problems with ftrace; this cannot work.

Does this patch by Adrian work for you? I think it solves this problem
and a bunch of others.

lkml.kernel.org/r/1435927962-32417-2-git-send-email-adrian.hunter@xxxxxxxxx