Re: [PATCH -tip] cpuacct: Make cpuacct hierarchy walk in cpuacct_charge() safe when rcupreempt is used.
From: Balbir Singh
Date: Wed Mar 18 2009 - 05:37:07 EST
* Bharata B Rao <bharata@xxxxxxxxxxxxxxxxxx> [2009-03-18 08:48:32]:
> On Tue, Mar 17, 2009 at 06:42:51PM +0530, Balbir Singh wrote:
> > * Bharata B Rao <bharata@xxxxxxxxxxxxxxxxxx> [2009-03-17 13:06:49]:
> >
> > > On Tue, Mar 17, 2009 at 02:28:11PM +0800, Li Zefan wrote:
> > > > Bharata B Rao wrote:
> > > > > cpuacct: Make cpuacct hierarchy walk in cpuacct_charge() safe when
> > > > > rcupreempt is used.
> > > > >
> > > > > cpuacct_charge() obtains task's ca and does a hierarchy walk upwards.
> > > > > This can race with the task's movement between cgroups. This race
> > > > > can cause an access to freed ca pointer in cpuacct_charge(). This will not
> > > >
> > > > Actually it can also end up accessing an invalid tsk->cgroups. ;)
> > > >
> > > > get tsk->cgroups (cg)
> > > > (move tsk to another cgroup) or (tsk exiting)
> > > > -> kfree(tsk->cgroups)
> > > > get cg->subsys[..]
> > >
> > > Ok :) Here is the patch again with updated description.
> > >
> > > cpuacct: Make cpuacct hierarchy walk in cpuacct_charge() safe when
> > > rcupreempt is used.
> > >
> > > cpuacct_charge() obtains task's ca and does a hierarchy walk upwards.
> > > This can race with the task's movement between cgroups. This race
> > > can cause an access to freed ca pointer in cpuacct_charge() or access
> > > to invalid cgroups pointer of the task. This will not happen with rcu or
> > > tree rcu as cpuacct_charge() is called with preemption disabled. However if
> > > rcupreempt is used, the race is seen. Thanks to Li Zefan for explaining this.
> > >
> > > Fix this race by explicitly protecting ca and the hierarchy walk with
> > > rcu_read_lock().
> > >
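For illustration, a minimal user-space sketch of the pattern described above: the task's ca pointer is loaded and the parent chain is walked entirely inside one read-side critical section. This is not the patch itself. The struct names, charge_sketch(), and the rcu_read_lock()/rcu_read_unlock() bodies are hypothetical stand-ins that only track nesting so the example compiles outside the kernel; the real primitives come from <linux/rcupdate.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-ins (assumption): these only count nesting so the sketch builds
 * in user space. They provide none of the guarantees of kernel RCU.
 */
static int rcu_nesting;
static void rcu_read_lock(void)   { rcu_nesting++; }
static void rcu_read_unlock(void) { rcu_nesting--; }

/* Hypothetical, simplified accounting group: one usage counter plus a parent link. */
struct cpuacct_sketch {
	uint64_t usage;
	struct cpuacct_sketch *parent;
};

/* Hypothetical, simplified task: just a pointer to its current group. */
struct task_sketch {
	struct cpuacct_sketch *ca;
};

/* Charge cputime to the task's group and to every ancestor. */
static void charge_sketch(struct task_sketch *tsk, uint64_t cputime)
{
	struct cpuacct_sketch *ca;

	rcu_read_lock();
	/* Load the pointer only after entering the read-side section. */
	ca = tsk->ca;
	for (; ca; ca = ca->parent) {
		/* The walk must never run outside the read-side section. */
		assert(rcu_nesting > 0);
		ca->usage += cputime;
	}
	rcu_read_unlock();
}
```

The point of the sketch is the ordering: both the pointer load and the walk sit between the lock and unlock calls, so with real RCU an object freed after a grace period cannot disappear while the walk is in progress.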
> >
> > Looks good and works very well (except for the batch issue that you
> > pointed out, it takes up to batch values before updates are seen).
> >
> > I'd like to get the patches into -tip and see the results. As an
> > enhancement to this patch, I would recommend using
> > percpu_counter_sum() when reading the data. As long as user space
> > does not overwhelm us with a lot of reads, sum would work out better.
> >
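To make the batch issue concrete, here is a minimal single-threaded sketch of a batched per-CPU counter. It is an illustration under stated assumptions, not the kernel's percpu_counter implementation: all names, NR_CPUS_SKETCH, and BATCH_SKETCH are hypothetical, and locking is omitted. counter_read() plays the role of the cheap, approximate read, while counter_sum() also adds the per-CPU deltas that have not yet been folded in, which is the behaviour suggested above for reads.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* hypothetical CPU count */
#define BATCH_SKETCH   32  /* hypothetical batch threshold */

struct pcpu_counter_sketch {
	int64_t count;                  /* shared total, may lag behind */
	int32_t local[NR_CPUS_SKETCH];  /* per-CPU deltas not yet folded in */
};

/* Add amount on behalf of one CPU; fold into the shared total once the delta reaches the batch. */
static void counter_add(struct pcpu_counter_sketch *c, int cpu, int32_t amount)
{
	int32_t v;

	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	v = c->local[cpu] + amount;
	if (v >= BATCH_SKETCH || v <= -BATCH_SKETCH) {
		c->count += v;
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = v;
	}
}

/* Cheap read: ignores pending per-CPU deltas, so it can be stale. */
static int64_t counter_read(const struct pcpu_counter_sketch *c)
{
	return c->count;
}

/* Accurate read: walks every CPU and includes the pending deltas. */
static int64_t counter_sum(const struct pcpu_counter_sketch *c)
{
	int64_t sum = c->count;

	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += c->local[cpu];
	return sum;
}
```

With these values, ten 1-unit charges on each of two CPUs leave counter_read() at 0 while counter_sum() returns 20. The cost of the accurate read is one pass over all CPUs, which is why it suits a file that user space reads only occasionally.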
> >
> > Tested-by: Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx>
> > Acked-by: Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx>
>
> So I guess this ack is not for this patch but for the per-cgroup
> stime/utime cpuacct controller statistics patch.
>
Yes, for both of these patches, actually. Thanks for pointing it out,
though.
--
Balbir