Re: [PATCH v4] sched: fix init NOHZ_IDLE flag
From: Vincent Guittot
Date: Wed Feb 27 2013 - 11:46:00 EST
On 27 February 2013 17:13, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> On Wed, Feb 27, 2013 at 09:28:26AM +0100, Vincent Guittot wrote:
>> > Ok I don't like having a per cpu state in struct sched domain but for
>> > now I can't find anything better. So my suggestion is that we do this
>> > and describe well the race, define the issue in the changelog and code
>> > comments and explain how we are solving it. This way at least the
>> > issue is identified and known. Then later, on review or after the
>> > patch is upstream, if somebody with some good taste comes with a
>> > better idea, we consider it.
>> >
>> > What do you think?
>>
>> I don't have better solution than adding this state in the
>> sched_domain if we want to keep the exact same behavior. This will be
>> a bit of waste of mem because we don't need to update all sched_domain
>> level (1st level is enough).
>
> Or you can try something like the below. The flags and the sched_domain share the
> same object here, and thus the same RCU lifecycle. And there shouldn't be any more
> overhead, since accessing rq->sd_rq->sd is the same as rq->sd_rq at the ASM level:
> only one pointer to dereference.
Your proposal avoids the wasted memory and keeps the flag and nr_busy in
sync. I'm going to try it.
Thanks
>
> Also, rq_idle becomes a value separate from rq->nohz_flags. It's a simple boolean
> (I'm just making it an int here because boolean sizes are a bit opaque; although
> they are supposed to be char, let's avoid surprises in structures).
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index cc03cfd..16c0d55 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -417,7 +417,10 @@ struct rq {
>
> #ifdef CONFIG_SMP
> struct root_domain *rd;
> - struct sched_domain *sd;
> + struct sched_domain_rq {
> + struct sched_domain sd;
> + int rq_idle;
> + } __rcu *sd_rq;
>
> unsigned long cpu_power;
>
> @@ -505,9 +508,11 @@ DECLARE_PER_CPU(struct rq, runqueues);
>
> #ifdef CONFIG_SMP
>
> -#define rcu_dereference_check_sched_domain(p) \
> - rcu_dereference_check((p), \
> - lockdep_is_held(&sched_domains_mutex))
> +#define rcu_dereference_check_sched_domain(p) ({ \
> + struct sched_domain_rq *__sd_rq = rcu_dereference_check((p), \
> + lockdep_is_held(&sched_domains_mutex)); \
> + __sd_rq ? &__sd_rq->sd : NULL; \
> +})
>
> /*
> * The domain tree (rq->sd) is protected by RCU's quiescent state transition.
> @@ -517,7 +522,7 @@ DECLARE_PER_CPU(struct rq, runqueues);
> * preempt-disabled sections.
> */
> #define for_each_domain(cpu, __sd) \
> - for (__sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd); \
> + for (__sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd_rq); \
> __sd; __sd = __sd->parent)
>
> #define for_each_lower_domain(sd) for (; sd; sd = sd->child)
>