Re: [Fwd: [-mm] Add an owner to the mm_struct (v9)]

From: Oleg Nesterov
Date: Tue Apr 15 2008 - 14:08:26 EST


On 04/14, Andrew Morton wrote:
>
> On Mon, 14 Apr 2008 19:43:11 +0530
> Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:
>
> > +void mm_update_next_owner(struct mm_struct *mm)
> > +{
> > +	struct task_struct *c, *g, *p = current;
> > +
> > +retry:
> > +	if (!mm_need_new_owner(mm, p))
> > +		return;
> > +
> > +	read_lock(&tasklist_lock);
> > +	/*
> > +	 * Search in the children
> > +	 */
> > +	list_for_each_entry(c, &p->children, sibling) {
> > +		if (c->mm == mm)
> > +			goto assign_new_owner;
> > +	}
> > +
> > +	/*
> > +	 * Search in the siblings
> > +	 */
> > +	list_for_each_entry(c, &p->parent->children, sibling) {
> > +		if (c->mm == mm)
> > +			goto assign_new_owner;
> > +	}
> > +
> > +	/*
> > +	 * Search through everything else. We should not get
> > +	 * here often
> > +	 */
> > +	do_each_thread(g, c) {
> > +		if (c->mm == mm)
> > +			goto assign_new_owner;
> > +	} while_each_thread(g, c);
> > +
> > +	read_unlock(&tasklist_lock);
>
> Potentially-long tasklist_lock hold times are a concern. I don't suppose
> rcu can save us?

I guess rcu can't help...

> Some additional commentary fleshing out "We should not get here often"
> might set minds at ease. How not-often? Under which circumstances?

Oh, I don't know what cgroup is, at all, but this looks really strange.

What about use_mm()? We can choose a kernel thread, but unuse_mm() doesn't
try to change ->owner...

Let's suppose a process with a lot of threads does exit_group() and nobody
else uses this ->mm. How many times will we re-assign mm->owner and iterate
over all the threads in the system?


Perhaps we can add mm_struct->mm_user_list instead? In that case mm->owner
becomes first_entry()...
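
To make the list idea concrete, here is a rough user-space model, not kernel
code and not a patch. Every name in it (struct task, struct mm, mm_add_user(),
mm_del_user(), mm_owner()) is made up for illustration, and locking is left
out entirely. It only shows that if each user of the mm sits on a per-mm list,
the owner can be read off the head of the list and no system-wide search is
needed when the current owner goes away.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these types are the kernel's. */
struct mm;

struct task {
	struct task *next_user;	/* links on the hypothetical mm_user_list */
	struct task *prev_user;
	struct mm *mm;
};

struct mm {
	struct task *users;	/* head of the hypothetical mm_user_list */
};

/* The owner is simply the first entry on the list, NULL if there is none. */
static struct task *mm_owner(const struct mm *mm)
{
	return mm->users;
}

/*
 * Attach a task at the tail so the oldest user stays the owner.
 * The walk is O(n) here only to keep the model short.
 */
static void mm_add_user(struct mm *mm, struct task *t)
{
	struct task **pp = &mm->users;
	struct task *prev = NULL;

	while (*pp) {
		prev = *pp;
		pp = &(*pp)->next_user;
	}
	t->prev_user = prev;
	t->next_user = NULL;
	t->mm = mm;
	*pp = t;
}

/* Detach a task in O(1); if it was the owner, the next entry takes over. */
static void mm_del_user(struct task *t)
{
	struct mm *mm = t->mm;

	assert(mm != NULL);
	if (t->prev_user)
		t->prev_user->next_user = t->next_user;
	else
		mm->users = t->next_user;
	if (t->next_user)
		t->next_user->prev_user = t->prev_user;
	t->next_user = t->prev_user = NULL;
	t->mm = NULL;
}
```

With this shape, the exit_group() case above costs one list removal per
exiting thread instead of a walk over every thread in the system.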

> > +assign_new_owner:
> > +	BUG_ON(c == p);
> > +	get_task_struct(c);
> > +	/*
> > +	 * The task_lock protects c->mm from changing.
> > +	 * We always want mm->owner->mm == mm
> > +	 */
> > +	task_lock(c);
> > +	/*
> > +	 * Delay read_unlock() till we have the task_lock()
> > +	 * to ensure that c does not slip away underneath us
> > +	 */
> > +	read_unlock(&tasklist_lock);

You can drop tasklist_lock right after get_task_struct(); the nested locks
are not preempt-friendly.
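
The ordering I mean can be shown with a toy, single-threaded model. This is
not kernel code: the "locks" below only count nesting depth, and all of the
model_* names are invented for the illustration. The point is just that once
the reference is taken, the global lock can go away before the per-task lock
is acquired, and the c->mm recheck from the patch still catches a task that
changed its mm in between.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a lock: it only records whether it is held. */
struct model_lock { int held; };

static int model_cur_depth;	/* number of toy locks held right now */
static int model_max_depth;	/* deepest nesting observed so far */

static void model_acquire(struct model_lock *l)
{
	assert(!l->held);
	l->held = 1;
	if (++model_cur_depth > model_max_depth)
		model_max_depth = model_cur_depth;
}

static void model_release(struct model_lock *l)
{
	assert(l->held);
	l->held = 0;
	model_cur_depth--;
}

struct model_task {
	int refcount;			/* models get/put_task_struct() */
	struct model_lock lock;		/* models task_lock() */
	const void *mm;
};

static struct model_lock model_tasklist_lock;	/* models tasklist_lock */

/*
 * Returns 1 with c->lock held and a reference taken if c still uses mm.
 * Returns 0 with nothing held and the reference dropped, in which case
 * the caller would retry the search.
 */
static int model_pin_and_lock(struct model_task *c, const void *mm)
{
	model_acquire(&model_tasklist_lock);
	/* ... the search that finds c would run here ... */
	c->refcount++;				/* pin c before unlocking */
	model_release(&model_tasklist_lock);	/* dropped before task lock */

	model_acquire(&c->lock);
	if (c->mm != mm) {			/* c moved on while unlocked */
		model_release(&c->lock);
		c->refcount--;
		return 0;
	}
	return 1;
}
```

In this model the nesting depth never exceeds one, which is the whole point
of the reordering.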

> > +	if (c->mm != mm) {
> > +		task_unlock(c);
> > +		put_task_struct(c);
> > +		goto retry;
> > +	}
> > +	cgroup_mm_owner_callbacks(mm->owner, c);

Can't we avoid calling cgroup_mm_owner_callbacks() at least when
mm->owner->cgroups == c->cgroups?

Minor, but perhaps cgroup_mm_owner_callbacks() should check ->mm_owner_changed
!= NULL first, then play with task_cgroup()...
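
Roughly what I have in mind for both points, as a user-space sketch only. I
don't know the cgroup code, so the demo_* types and functions are invented
stand-ins and the real cgroup_mm_owner_callbacks() surely looks different.
It also assumes the answer to the question above is "yes", i.e. that nothing
needs to be notified when the old and new owner share the same ->cgroups.

```c
#include <assert.h>
#include <stddef.h>

struct demo_task {
	const void *cgroups;	/* stands in for task->cgroups */
};

/* Hypothetical subsystem with an optional owner-change hook. */
struct demo_subsys {
	void (*mm_owner_changed)(struct demo_task *old_owner,
				 struct demo_task *new_owner);
};

static int demo_calls;	/* counts hook invocations, for illustration */

static void demo_hook(struct demo_task *old_owner,
		      struct demo_task *new_owner)
{
	(void)old_owner;
	(void)new_owner;
	demo_calls++;
}

static void demo_owner_callbacks(struct demo_subsys *ss, size_t n,
				 struct demo_task *old_owner,
				 struct demo_task *new_owner)
{
	size_t i;

	assert(old_owner && new_owner);

	/* Same cgroups on both sides: assumed to need no notification. */
	if (old_owner->cgroups == new_owner->cgroups)
		return;

	for (i = 0; i < n; i++) {
		/* Check the hook first, before any per-subsystem lookups. */
		if (!ss[i].mm_owner_changed)
			continue;
		ss[i].mm_owner_changed(old_owner, new_owner);
	}
}
```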

Oleg.
