Re: [PATCH -tip] cpuacct: Make cpuacct hierarchy walk in cpuacct_charge() safe when rcupreempt is used.

From: Peter Zijlstra
Date: Thu Mar 19 2009 - 05:20:44 EST


On Tue, 2009-03-17 at 13:06 +0530, Bharata B Rao wrote:
> On Tue, Mar 17, 2009 at 02:28:11PM +0800, Li Zefan wrote:
> > Bharata B Rao wrote:
> > > cpuacct: Make cpuacct hierarchy walk in cpuacct_charge() safe when
> > > rcupreempt is used.
> > >
> > > cpuacct_charge() obtains task's ca and does a hierarchy walk upwards.
> > > This can race with the task's movement between cgroups. This race
> > > can cause an access to freed ca pointer in cpuacct_charge(). This will not
> >
> > Actually it can also end up accessing an invalid tsk->cgroups. ;)
> >
> > get tsk->cgroups (cg)
> > (move tsk to another cgroup) or (tsk exiting)
> > -> kfree(tsk->cgroups)
> > get cg->subsys[..]
>
> Ok :) Here is the patch again with updated description.
>
> cpuacct: Make cpuacct hierarchy walk in cpuacct_charge() safe when
> rcupreempt is used.
>
> cpuacct_charge() obtains task's ca and does a hierarchy walk upwards.
> This can race with the task's movement between cgroups. This race
> can cause an access to freed ca pointer in cpuacct_charge() or access
> to invalid cgroups pointer of the task. This will not happen with rcu or
> tree rcu as cpuacct_charge() is called with preemption disabled. However if
> rcupreempt is used, the race is seen. Thanks to Li Zefan for explaining this.
>
> Fix this race by explicitly protecting ca and the hierarchy walk with
> rcu_read_lock().
>
> Signed-off-by: Bharata B Rao <bharata@xxxxxxxxxxxxxxxxxx>

I would ditch the comment; it doesn't add anything.

The simple rule is: if you want RCU safety, use rcu_read_lock().
Disabling preemption or IRQs isn't sufficient, and hasn't been for a
long, long while.

After that,

Acked-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>

> ---
> kernel/sched.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -9891,6 +9891,13 @@ static void cpuacct_charge(struct task_s
>  		return;
>
>  	cpu = task_cpu(tsk);
> +
> +	/*
> +	 * preemption is already disabled here, but to be safe with
> +	 * rcupreempt, take rcu_read_lock(). This protects ca and
> +	 * hence the hierarchy walk.
> +	 */
> +	rcu_read_lock();
>  	ca = task_ca(tsk);
>
>  	do {
> @@ -9898,6 +9905,7 @@ static void cpuacct_charge(struct task_s
>  		*cpuusage += cputime;
>  		ca = ca->parent;
>  	} while (ca);
> +	rcu_read_unlock();
>  }
>
> struct cgroup_subsys cpuacct_subsys = {
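
For readers without the tree at hand, here is a rough userspace sketch of
the pattern the patch applies: look up the accounting group and walk its
ancestors entirely inside one RCU read-side section. It is not the kernel
code. rcu_read_lock()/rcu_read_unlock() are stubs that only count nesting,
and struct task and the single cpuusage counter are hypothetical
simplifications (the real code uses task_ca() and per-CPU counters).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace stand-ins (assumptions, not the kernel implementation):
 * they only track read-side nesting so the sketch compiles and runs.
 */
static int rcu_nesting;
static void rcu_read_lock(void)   { rcu_nesting++; }
static void rcu_read_unlock(void) { assert(rcu_nesting > 0); rcu_nesting--; }

/* Simplified, hypothetical accounting group: one counter, no per-CPU data. */
struct cpuacct {
	unsigned long long cpuusage;
	struct cpuacct *parent;
};

/* Hypothetical task holding a direct pointer to its accounting group. */
struct task {
	struct cpuacct *ca;
};

/* Charge cputime to the task's group and to every ancestor up to the root. */
void cpuacct_charge(struct task *tsk, unsigned long long cputime)
{
	struct cpuacct *ca;

	rcu_read_lock();
	ca = tsk->ca;	/* the lookup itself belongs inside the read-side section */
	do {
		assert(rcu_nesting > 0);	/* the whole walk stays protected */
		ca->cpuusage += cputime;
		ca = ca->parent;
	} while (ca);
	rcu_read_unlock();
}
```

Charging a task whose group has one parent should bump both counters by
the same amount and leave the nesting count at zero afterwards. The point
of the sketch is only the placement of the lock: both the pointer fetch
and the parent walk sit between rcu_read_lock() and rcu_read_unlock().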
