[RFC 0/6] optimize ctx switch with rb-tree

From: David Carrillo-Cisneros
Date: Tue Jan 10 2017 - 05:25:42 EST


Following the discussion in:
https://patchwork.kernel.org/patch/9420035/

This is is an early version of a series of perf context switches
optimizations.

The main idea is to create and maintain a list of inactive events sorted
by timestamp, and a rb-tree index to index it. The rb-tree's key are
{cpu,flexible,stamp} for task contexts and {cgroup,flexible,stamp}
for CPU contexts.

The rb-tree provides functions to find intervals in the inactive event
list so that ctx_sched_in only has to visit the events that can be
potentially be scheduled (i.e. avoid iterations over events bound
to CPUs or cgroups that are not current).

Since the inactive list is sort by timestamp, rotation can be done by
simply scheduling out and in the events. This implies that each timer
interrupt, the events will rotate by q events (where q is the number
of hardware counters). This changes the current behavior of rotation.
Feedback welcome!

I haven't profiled the new approach. I am only assuming it will be
superior when the number of per-cpu or distict cgroup events is large.

The last patch shows how perf_iterate_ctx can use the new rb-tree index
to reduce the number of visited events. I haven't looked carefully if
locking and other things are correct.

If this changes are in the right direction. A next version could remove
some existing code, specifically the lists ctx->pinned_groups and
ctx->flexible_groups could be removed. Also, event_filter_match could be
simplified when called on events groups filtered using the rb-tree, since
both perform similar checks.

David Carrillo-Cisneros (6):
perf/core: create active and inactive event groups
perf/core: add a rb-tree index to inactive_groups
perf/core: use rb-tree to sched in event groups
perf/core: avoid rb-tree traversal when no inactive events
perf/core: rotation no longer neccesary. Behavior has changed. Beware
perf/core: use rb-tree index to optimize filtered perf_iterate_ctx

include/linux/perf_event.h | 13 ++
kernel/events/core.c | 466 +++++++++++++++++++++++++++++++++++++++------
2 files changed, 426 insertions(+), 53 deletions(-)

--
2.11.0.390.gc69c2f50cf-goog