Re: [PATCH 3/6] cpumask: truncate task_struct.cpus_allowed forCONFIG_CPUMASK_OFFSTACK

From: Ingo Molnar
Date: Mon Nov 23 2009 - 18:14:58 EST



* Rusty Russell <rusty@xxxxxxxxxxxxxxx> wrote:

> On Tue, 24 Nov 2009 04:53:07 am Ingo Molnar wrote:
> >
> > * Rusty Russell <rusty@xxxxxxxxxxxxxxx> wrote:
> >
> > > Turns cpus_allowed into a bitmap, and truncate it to nr_cpu_ids if
> > > CONFIG_CPUMASK_OFFSTACK is set.
> > >
> > > I do this rather than the classic [0] dangling array trick, because of
> > > INIT_TASK and references to sizeof(struct task_struct).
> > >
> > > Signed-off-by: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
> > > ---
> > > include/linux/init_task.h | 2 +-
> > > include/linux/sched.h | 7 +++++--
> > > kernel/fork.c | 19 ++++++++++++++++++-
> > > 3 files changed, 24 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/include/linux/init_task.h b/include/linux/init_task.h
> > > --- a/include/linux/init_task.h
> > > +++ b/include/linux/init_task.h
> > > @@ -130,7 +130,7 @@ extern struct cred init_cred;
> > > .static_prio = MAX_PRIO-20, \
> > > .normal_prio = MAX_PRIO-20, \
> > > .policy = SCHED_NORMAL, \
> > > - .cpus_allowed = CPU_MASK_ALL, \
> > > + .cpus_allowed = CPU_BITS_ALL, \
> > > .mm = NULL, \
> > > .active_mm = &init_mm, \
> > > .se = { \
> > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > --- a/include/linux/sched.h
> > > +++ b/include/linux/sched.h
> > > @@ -1256,7 +1256,6 @@ struct task_struct {
> > > #endif
> > >
> > > unsigned int policy;
> > > - cpumask_t cpus_allowed;
> > >
> > > #ifdef CONFIG_TREE_PREEMPT_RCU
> > > int rcu_read_lock_nesting;
> > > @@ -1544,10 +1543,14 @@ struct task_struct {
> > > unsigned long trace_recursion;
> > > #endif /* CONFIG_TRACING */
> > > unsigned long stack_start;
> > > +
> > > + /* This has to go at the end: if CONFIG_CPUMASK_OFFSTACK=y, only
> > > + * nr_cpu_ids bits will actually be allocated. */
> > > + DECLARE_BITMAP(cpus_allowed, CONFIG_NR_CPUS);
> >
> > (nit: please use the comment style you see elsewhere in the file.)
>
> Sure, my mistake.
>
> > > };
> > >
> > > /* Future-safe accessor for struct task_struct's cpus_allowed. */
> > > -#define tsk_cpumask(tsk) (&(tsk)->cpus_allowed)
> > > +#define tsk_cpumask(tsk) (to_cpumask((tsk)->cpus_allowed))
> >
> > Please use tsk_cpus_allowed() throughout - so that people who knew what
> > p->cpus_allowed did know what this new thing does.
>
> OK. I chose this because it's shorter and consistent with other uses
> (esp. mm_cpumask). It's been in Linus' tree for a while, but noone uses
> it so it's easy to rename.

Thanks!

> > Also, i'm still having second thoughts about it all - could we somehow
> > avoid all this wrappery of commonly used fields? (My main and pretty
> > much only worry is that struct field members look so much cleaner than
> > some wrapper intermediary.)
>
> I'm not a fan either, but it does avoid a flag-day transition. And I was
> pleasantly surprised how few places access ->cpus_allowed.
>
> If we use a cpumask_var_t, we still need to change all the callers
> (since it'll now look like a ptr), but we get a gratuitous ptr deref
> for CONFIG_CPUMASK_OFFSTACK=y.
>
> INIT_TASK needs some tweaking, but it's not a showstopper AFAICT.
>
> Want me to create a cpumask_var_t version for comparison?

There's also the problem of cache footprint - we really want to freedom
to place the masks (which are not off-stack in the regular case)
anywhere in task struct - and task-struct is huge, so cache placement
matters.

Is the ptr deref really a big problem for off-stack?

Changing all users isnt a problem as long as it still looks like a clean
data structure with clean usage. Those end-of-struct tricks we play with
cpumasks make me really a bit uneasy.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/