Re: [RFC][PATCH 4/7] sched: Replace sd_busy/nr_busy_cpus with sched_domain_shared

From: Peter Zijlstra
Date: Wed May 11 2016 - 13:38:05 EST


On Wed, May 11, 2016 at 12:55:56PM +0100, Matt Fleming wrote:

> This breaks my POWER7 box which presumably doesn't have SD_SHARE_PKG_RESOURCES,

> index 978b3ef2d87e..d27153adee4d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7920,7 +7920,8 @@ static inline void set_cpu_sd_state_busy(void)
>  		goto unlock;
>  	sd->nohz_idle = 0;
>
> -	atomic_inc(&sd->shared->nr_busy_cpus);
> +	if (sd->shared)
> +		atomic_inc(&sd->shared->nr_busy_cpus);
>  unlock:
>  	rcu_read_unlock();
>  }


Ah, no, the problem is that while it does have SHARE_PKG_RESOURCES (in
its SMT domain -- SMT threads share all cache after all), I failed to
connect the sched_domain_shared structure for it.

Does something like this also work?

---
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6389,10 +6389,6 @@ sd_init(struct sched_domain_topology_lev
 		sd->cache_nice_tries = 1;
 		sd->busy_idx = 2;

-		sd->shared = *per_cpu_ptr(sdd->sds, sd_id);
-		atomic_inc(&sd->shared->ref);
-		atomic_set(&sd->shared->nr_busy_cpus, sd_weight);
-
 #ifdef CONFIG_NUMA
 	} else if (sd->flags & SD_NUMA) {
 		sd->cache_nice_tries = 2;
@@ -6414,6 +6410,16 @@ sd_init(struct sched_domain_topology_lev
 		sd->idle_idx = 1;
 	}

+	/*
+	 * For all levels sharing cache; connect a sched_domain_shared
+	 * instance.
+	 */
+	if (sd->flags & SD_SHARE_PKG_RESOURCES) {
+		sd->shared = *per_cpu_ptr(sdd->sds, sd_id);
+		atomic_inc(&sd->shared->ref);
+		atomic_set(&sd->shared->nr_busy_cpus, sd_weight);
+	}
+
 	sd->private = sdd;

 	return sd;
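
For illustration only, here is a minimal userspace sketch of why moving the
hookup out of the if/else-if chain can matter. It is not kernel code: the
demo_* names and flag bits are hypothetical, and it assumes (the mail does
not spell this out) that an SMT-like level matches an earlier arm of the
chain, so the cache-sharing arm is never reached for it.

```c
#include <stddef.h>

/* Hypothetical flag bits for this sketch; not the kernel's SD_* values. */
enum {
	DEMO_SHARE_CAPACITY = 1 << 0,	/* assumed SMT-only property */
	DEMO_SHARE_CACHE    = 1 << 1,	/* stands in for SD_SHARE_PKG_RESOURCES */
};

/* Simplified stand-in for sched_domain_shared (plain ints, no atomics). */
struct demo_shared {
	int ref;
	int nr_busy_cpus;
};

/* Simplified stand-in for sched_domain. */
struct demo_domain {
	int flags;
	struct demo_shared *shared;
};

/* Old shape: the hookup lives inside one arm of an if/else-if chain. */
static void demo_init_chained(struct demo_domain *sd,
			      struct demo_shared *sds, int weight)
{
	sd->shared = NULL;

	if (sd->flags & DEMO_SHARE_CAPACITY) {
		/* An SMT-like level stops here; the next arm is skipped. */
	} else if (sd->flags & DEMO_SHARE_CACHE) {
		sd->shared = sds;
		sds->ref++;
		sds->nr_busy_cpus = weight;
	}
}

/* New shape: an independent test after the chain covers every level. */
static void demo_init_separate(struct demo_domain *sd,
			       struct demo_shared *sds, int weight)
{
	sd->shared = NULL;

	if (sd->flags & DEMO_SHARE_CAPACITY) {
		/* Per-level tuning only; no hookup in the chain. */
	}

	if (sd->flags & DEMO_SHARE_CACHE) {
		sd->shared = sds;
		sds->ref++;
		sds->nr_busy_cpus = weight;
	}
}
```

With a domain whose flags are DEMO_SHARE_CAPACITY | DEMO_SHARE_CACHE, the
chained variant leaves ->shared NULL, which is the pointer an unconditional
atomic_inc(&sd->shared->nr_busy_cpus) would then dereference; the separate
variant connects it.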