Re: [PATCH v2] perf/core: set cgroup in cpu contexts for new cgroup events
From: Vegard Nossum
Date: Wed Aug 03 2016 - 12:01:01 EST
Hi,
On 2 August 2016 at 09:48, David Carrillo-Cisneros <davidcc@xxxxxxxxxx> wrote:
> There is an optimization in perf_cgroup_sched_{in,out} that skips the
> switch of cgroup events if the old and new cgroups in a task switch are
> the same. This optimization interacts with the current code in two ways
> that cause a cpu context's cgroup (cpuctx->cgrp) to be NULL even if a
> cgroup event matches the current task. These are:
>
> 1. On creation of the first cgroup event in a CPU: In current code,
> cpuctx->cpu is only set in perf_cgroup_sched_in, but due to the
> aforesaid optimization, perf_cgroup_sched_in will run until the next
> cgroup switches in that cpu. This may happen late or never happen,
> depending on system's number of cgroups, cpu load, etc.
>
> 2. On deletion of the last cgroup event in a cpuctx: In list_del_event,
> cpuctx->cgrp is set NULL. Any new cgroup event will not be sched in
> because cpuctx->cgrp == NULL until a cgroup switch occurs and
> perf_cgroup_sched_in is executed (updating cpuctx->cgrp).
>
> This patch fixes both problems by setting cpuctx->cgrp in list_add_event,
> mirroring what list_del_event does when removing a cgroup event from CPU
> context, as introduced in:
> commit 68cacd29167b ("perf_events: Fix stale ->cgrp pointer in
> update_cgrp_time_from_cpuctx()")
>
> With this patch, cpuctx->cgrp is always set/clear when installing/removing
> the first/last cgroup event in/from the cpu context. With cpuctx->cgrp
> correctly set, event_filter_match works as intended when events are
> sched in/out.
>
> The problem is easy to observe in a machine with only one cgroup:
>
> $ perf stat -e cycles -I 1000 -C 0 -G /
> # time counts unit events
> 1.000161699 <not counted> cycles /
> 2.000355591 <not counted> cycles /
> 3.000565154 <not counted> cycles /
> 4.000951350 <not counted> cycles /
>
> After the fix, the output is as expected:
>
> $ perf stat -e cycles -I 1000 -a -G /
> # time counts unit events
> 1.004699159 627342882 cycles /
> 2.007397156 615272690 cycles /
> 3.010019057 616726074 cycles /
>
> Rebased at peterz/queue/perf/core.
>
> Changes in v2:
> - Fix build error when no CONFIG_CGROUP_PERF.
> - Unify add and del cases into list_update_cgroup_event.
> - Remove cgroup exclusive variables from builds
> without CONFIG_CGROUP_PERF.
>
> Signed-off-by: David Carrillo-Cisneros <davidcc@xxxxxxxxxx>
Is this supposed to fix the bug I reported here?
https://www.mail-archive.com/linux-kernel@xxxxxxxxxxxxxxx/msg1197757.html
If so, you may want to add:
1) Fixes: f2fb6bef92514 ("perf/core: Optimize side-band event delivery")
2) Reported-by: Vegard Nossum <vegard.nossum@xxxxxxxxxx>
3) a link to the thread
and I can give it a test to see if it fixes the problem I was running into.
If not, please ignore :-)
Thanks,
Vegard