Re: +cgroups-add-functionality-to-read-write-lock-clone_thread-forking-per-threadgroup.patch added to -mm tree

From: Oleg Nesterov
Date: Fri Aug 21 2009 - 06:49:22 EST


In case I wasn't clear.

Let's suppose we have subthreads T1 and T2, and we have a reference to T1.
T1->thread_group->next == T2.

T1 dies, T1->thread_group->next is still T2.

T2 dies, an RCU grace period passes, and its memory is freed and re-used.
But T1->thread_group->next is still T2.

Now, we call threadgroup_fork_lock(T1), it sees T1->sighand == NULL and does

rcu_read_lock();
list_for_each_entry_rcu(p, &T1->thread_group, thread_group)

But T1->thread_group->next points to nowhere: T2's memory was already freed,
and rcu_read_lock() taken this late can't bring it back.
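
To make the sequence above concrete, here is a minimal userspace sketch. It is
not kernel code: struct task, list_del_rcu_like() and stale_next_demo() are
names made up for the illustration, the group leader stands in as the list
head, and T2 is never really freed so the program stays well-defined. The only
thing taken from the kernel is the unlink behaviour of list_del_rcu(), which
re-links the neighbours and leaves the removed entry's ->next alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical, stripped-down task: only what the scenario needs. */
struct task {
	const char *name;
	struct list_head thread_group;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/*
 * Same unlink as list_del_rcu(): the neighbours are linked around the
 * entry, but entry->next is left untouched so that readers already
 * standing on the entry can keep walking.
 */
static void list_del_rcu_like(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	entry->prev = NULL;	/* the kernel stores a poison value here */
}

/* Can 'target' be reached by following ->next from 'head'? */
static bool reachable_from(struct list_head *head, struct list_head *target)
{
	for (struct list_head *pos = head->next; pos != head; pos = pos->next)
		if (pos == target)
			return true;
	return false;
}

/*
 * Returns true if, after T1 and then T2 exit, T1's ->next still points
 * at T2 although T2 is no longer part of the thread group.
 */
static bool stale_next_demo(void)
{
	struct task leader = { .name = "leader" };
	struct task t1 = { .name = "T1" };
	struct task t2 = { .name = "T2" };

	list_init(&leader.thread_group);
	list_add_tail(&t1.thread_group, &leader.thread_group);
	list_add_tail(&t2.thread_group, &leader.thread_group);
	assert(reachable_from(&leader.thread_group, &t2.thread_group));

	list_del_rcu_like(&t1.thread_group);	/* T1 dies */
	list_del_rcu_like(&t2.thread_group);	/* T2 dies, would be freed later */

	return t1.thread_group.next == &t2.thread_group &&
	       !reachable_from(&leader.thread_group, &t2.thread_group);
}
```

Calling stale_next_demo() from a main() shows the point: nothing that is still
on the list refers to T2, so nothing stops it from being freed, yet a walk that
starts from T1 goes straight to it.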


Once again, I didn't actually read these patches, perhaps I missed something.

Oleg.

On 08/21, Oleg Nesterov wrote:
>
> On 08/20, Andrew Morton wrote:
> >
> > Subject: cgroups: add functionality to read/write lock CLONE_THREAD fork()ing per-threadgroup
> > From: Ben Blum <bblum@xxxxxxxxxx>
> >
> > Add an rwsem that lives in a threadgroup's sighand_struct (next to the
> > sighand's atomic count, to piggyback on its cacheline), and two functions
> > in kernel/cgroup.c (for now) for easily+safely obtaining and releasing it.
>
> Sorry. Currently I have no time to read these patches. Absolutely :/
>
> But the very first change I noticed outside of cgroups.[ch] looks very wrong,
>
> > +struct sighand_struct *threadgroup_fork_lock(struct task_struct *tsk)
> > +{
> > + struct sighand_struct *sighand;
> > + struct task_struct *p;
> > +
> > + /* tasklist lock protects sighand_struct's disappearance in exit(). */
> > + read_lock(&tasklist_lock);
> > + if (likely(tsk->sighand)) {
> > + /* simple case - check the thread we were given first */
> > + sighand = tsk->sighand;
> > + } else {
> > + sighand = NULL;
> > + /*
> > + * tsk is exiting; try to find another thread in the group
> > + * whose sighand pointer is still alive.
> > + */
> > + rcu_read_lock();
> > + list_for_each_entry_rcu(p, &tsk->thread_group, thread_group) {
>
> If ->sighand == NULL we can't use list_for_each_entry_rcu(->thread_group),
> and rcu_read_lock() can't help.
>
> The task was removed from ->thread_group, its ->next points to nowhere.
>
> list_for_rcu(head) can _only_ work if we can trust head->next: it should
> point either to "head" (list_empty), or to the valid entry.
>
> Please correct me if I missed something.
>
> Otherwise, please send the changes which touch the process-management
> code separately. And please do not forget to CC people who work with
> this code ;)
>
> Oleg.
