Re: [PATCH cgroup/for-5.5] cgroup: remove cgroup_enable_task_cg_lists() optimization

From: Christian Brauner
Date: Mon Oct 28 2019 - 13:48:09 EST


On Mon, Oct 28, 2019 at 05:48:52PM +0100, Oleg Nesterov wrote:
> On 10/25, Christian Brauner wrote:
> >
> > On Fri, Oct 25, 2019 at 05:52:25PM +0200, Oleg Nesterov wrote:
> > > On 10/25, Christian Brauner wrote:
> > > >
> > > > On Fri, Oct 25, 2019 at 04:13:25PM +0200, Oleg Nesterov wrote:
> > > > > Almost every usage of task->flags (load or store) can be reported as "data race".
> > > > >
> > > > > Say, you do
> > > > >
> > > > > if (task->flags & PF_KTHREAD)
> > > > >
> > > > > while this task does
> > > > >
> > > > > current->flags |= PF_FREEZER_SKIP;
> > > > > schedule().
> > > > >
> > > > > this is data race.
> > > >
> > > > Right, but I thought we agreed on WONTFIX in those scenarios?
> > > > The alternative is to READ_ONCE()/WRITE_ONCE() all of these.
> > >
> > > Well, in my opinion this is WONTFIX, but I won't argue if someone
> > > adds _ONCE to all of these. Same for task->state, exit_state, and
> > > more.
> >
> > Well, I honestly think that state and exit_state would make sense.
>
> Heh. Again, I am not arguing, but...
>
> OK, let's suppose we blindly add READ_ONCE() to every access of
> task->state/exit_state.
>
> Yes, this won't hurt and possibly can fix some bugs we are not aware of.

I wasn't planning or working on adding *_ONCE everywhere. ;)
I just think it makes sense as a preemptive strike since they are shared
(though mostly protected by locks anyway).
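
To illustrate what marking the task->flags accesses from earlier in the
thread could look like, here is a rough userspace sketch. It is not kernel
code: struct fake_task, is_kthread() and set_freezer_skip() are made-up
names, the flag values are only illustrative, and the two macros are
simplified stand-ins for the kernel's READ_ONCE()/WRITE_ONCE().

```c
/* Simplified stand-ins for the kernel macros (assumption: GCC/Clang,
 * since __typeof__ is a compiler extension). */
#define READ_ONCE(x)      (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))

/* Illustrative flag values for this sketch only. */
#define PF_KTHREAD       0x00200000u
#define PF_FREEZER_SKIP  0x40000000u

/* Hypothetical stand-in for task_struct, holding just the flags word. */
struct fake_task {
	unsigned int flags;
};

/* Remote reader: a single marked load of the shared word. */
int is_kthread(struct fake_task *t)
{
	return (READ_ONCE(t->flags) & PF_KTHREAD) != 0;
}

/* Owner-side update: marked load, then marked store. The
 * read-modify-write as a whole is still not atomic; the marking only
 * constrains how the compiler emits each individual access. */
void set_freezer_skip(struct fake_task *t)
{
	WRITE_ONCE(t->flags, READ_ONCE(t->flags) | PF_FREEZER_SKIP);
}
```

The marking documents that the word is shared and keeps the compiler from
tearing or repeating the access. It does not add any ordering or locking.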

>
> However,
>
> > There already were issues that got fixed for example in 3245d6acab98
> > ("exit: fix race between wait_consider_task() and wait_task_zombie()")
>
> The change above can't fix the problem like this.

No argument about the code we discussed right here, for sure!

>
> It is not that this code lacked READ_ONCE(). I am sure me and others
> understood that this code can read ->exit_state more than once, just
> nobody noticed that in this case this is really wrong.
>
> IOW, if we simply change the code before 3245d6acab98 to use READ_ONCE()
> the code will be equally wrong, and
>
> > and as far as I understand this would also help kcsan to better detect
> > races.
>
> this change will simply hide the problem from kcsan.

I can't speak to that, since the claim is that READ_ONCE() helps them even
if it's not really doing anything. But maybe I misunderstood the
k{c,t}san manpage.
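
For the record, here is how I read Oleg's point about reading ->exit_state
more than once, as a small userspace sketch. This is not the code from
3245d6acab98: struct fake_task, the FAKE_* states, racing_update() and
both check functions are invented for illustration, and the racing writer
is simulated by a plain function call so the outcome is deterministic.

```c
/* Simplified stand-ins for the kernel macros (assumption: GCC/Clang,
 * since __typeof__ is a compiler extension). */
#define READ_ONCE(x)      (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical states, not the kernel's EXIT_* values. */
enum { FAKE_ZOMBIE = 1, FAKE_DEAD = 2 };

struct fake_task {
	int exit_state;
};

/* Stands in for another CPU changing the state between two loads. */
void racing_update(struct fake_task *t)
{
	WRITE_ONCE(t->exit_state, FAKE_DEAD);
}

/* Every load is marked, yet the function still looks at the field
 * twice and can see two different values. Returns 1 if both loads
 * agree, 0 if they do not. */
int check_twice(struct fake_task *t)
{
	int first = READ_ONCE(t->exit_state);

	racing_update(t);	/* simulated concurrent writer */

	int second = READ_ONCE(t->exit_state);

	return first == second;
}

/* Load once into a local and base every later decision on that
 * snapshot. Returns 1 if the snapshot was FAKE_ZOMBIE. */
int check_snapshot(struct fake_task *t)
{
	int state = READ_ONCE(t->exit_state);

	racing_update(t);	/* a later change no longer affects us */

	return state == FAKE_ZOMBIE;
}
```

With a task that starts out as FAKE_ZOMBIE, check_twice() reports a
mismatch even though both loads are marked, while check_snapshot() stays
consistent. So the marking alone does not fix the logic. What matters is
how many times the field is read.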

Christian