Re: [PATCH] perf: Fix RCU dereference check in perf_event_comm

From: Peter Zijlstra
Date: Mon Mar 26 2012 - 09:00:59 EST


On Thu, 2012-03-22 at 13:36 +0200, Ari Savolainen wrote:
> 22. maaliskuuta 2012 11.53 Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> kirjoitti:
> > On Thu, 2012-03-22 at 01:43 +0200, Ari Savolainen wrote:
> >> The warning below is printed when executing a command like
> >> sudo perf record su - user -c "echo hello"
> >>
> >> It's fixed by moving the call of perf_event_comm to be protected
> >> by the task lock.
> >
> > That seems like a rather poor solution since it increases the lock hold
> > time for no explained reason.
> >
> >> include/linux/cgroup.h:567 suspicious rcu_dereference_check() usage!
> >
> >> [<ffffffff8109be55>] lockdep_rcu_suspicious+0xe5/0x100
> >> [<ffffffff811131fa>] perf_event_comm+0x37a/0x4d0
> >
> > So where exactly is this, perf_event_comm_event() takes rcu_read_lock()
> > so I presume its before that.
>
> I think the warning comes from this source-level call path:
>
> perf_event_comm ->
> perf_event_enable_on_exec ->
> perf_cgroup_sched_out ->
> perf_cgroup_from_task ->
> task_subsys_state ->
> task_subsys_state_check
>
> It seems there that path does not take rcu_read_lock(). Where should
> rcu_read_lock/unlock be added? In perf_group_sched_out around the
> calls of perf_cgroup_from_task? Like this:

Ah, ok. So IIRC this too is not needed. As the comment near
perf_cgroup_from_task() says, we hold explicit references to the cgroup.

Ideally we'd come up with a better validation condition but all variants
I could come up with make the code ugly and might actually generate
worse code, the current true simply shuts it up.

Stephane any thoughts?

---
kernel/events/core.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index a6a9ec4..e423261 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -240,7 +240,7 @@ static void perf_ctx_unlock(struct perf_cpu_context *cpuctx,
static inline struct perf_cgroup *
perf_cgroup_from_task(struct task_struct *task)
{
- return container_of(task_subsys_state(task, perf_subsys_id),
+ return container_of(task_subsys_state_check(task, perf_subsys_id, true),
struct perf_cgroup, css);
}


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/