Re: [RFC PATCH 3/4] sched/topology: remove smt_gain

From: Srikar Dronamraju
Date: Tue Sep 04 2018 - 06:37:55 EST


* Vincent Guittot <vincent.guittot@xxxxxxxxxx> [2018-09-04 11:36:26]:

> Hi Srikar,
>
> On Tuesday 04 Sep 2018 at 01:24:24 (-0700), Srikar Dronamraju wrote:
> > However after this change, capacity_orig of each SMT thread would be
> > 1024. For example SMT 8 core capacity_orig would now be 8192.
> >
> > smt_gain was supposed to make a multi-threaded core slightly more
> > powerful than a single-threaded core. I suspect that sometimes hurt
>
> Is there a system with both single-threaded and multi-threaded cores?
> That was the main open point for me (and for Qais too)
>

I don't know of any systems that ship with both single-threaded and
multi-threaded cores. However, a user can still offline a few threads in
one core while leaving other cores untouched. I don't really know why
somebody would want to do that, but, for example, one customer was
toying with SMT 3 mode on an SMT 8 POWER8 box.
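
To put numbers on the mixed-thread-count case, here is a rough user-space
sketch of the per-thread arithmetic. It assumes the default smt_gain of
1178 and a default arch_scale_cpu_capacity() that divides smt_gain by the
number of online siblings. The helper names are made up for illustration
and are not kernel functions.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE 1024UL
#define DEFAULT_SMT_GAIN     1178UL	/* assumed default, roughly +15% */

/* Hypothetical helper: capacity_orig of one thread with smt_gain in place. */
static inline unsigned long
thread_capacity_with_smt_gain(unsigned long online_siblings)
{
	assert(online_siblings > 0);
	if (online_siblings > 1)
		return DEFAULT_SMT_GAIN / online_siblings;
	return SCHED_CAPACITY_SCALE;
}

/* Hypothetical helper: capacity_orig of one thread once smt_gain is gone. */
static inline unsigned long thread_capacity_without_smt_gain(void)
{
	return SCHED_CAPACITY_SCALE;
}

/* Core capacity is taken as the sum over the core's online threads. */
static inline unsigned long
core_capacity(unsigned long per_thread, unsigned long online_siblings)
{
	return per_thread * online_siblings;
}
```

Under these assumptions a core with 8 online siblings gets 147 per thread
and 1176 in total, and a core with 3 online siblings gets 392 per thread
and also 1176 in total. Without smt_gain the same two cores come out at
8192 and 3072, so the core with offlined threads no longer looks
equivalent to its neighbours.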

>
> > us when doing load balance between 2 cores, i.e. at the MC or DIE sched
> > domain. Even with 2 threads running on a core, the core might look
> > lightly loaded (2048/8192). Hence it might dissuade movement to an
> > idle core.
>
> Then, there is the sibling flag at SMT level that normally ensures 1 task per
> core for such a use case
>

Agree.
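
For reference, the 2048/8192 concern can be sketched as below. This
assumes the load balancer normalises a group's load as
group_load * SCHED_CAPACITY_SCALE / group_capacity, as
update_sg_lb_stats() appears to do, and that each busy thread contributes
a load of 1024. The helper name is hypothetical.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE 1024UL

/* Hypothetical helper mirroring the assumed avg_load normalisation. */
static inline unsigned long group_avg_load(unsigned long group_load,
					   unsigned long group_capacity)
{
	assert(group_capacity > 0);
	return group_load * SCHED_CAPACITY_SCALE / group_capacity;
}
```

With two busy threads (load 2048) on an SMT 8 core, group_avg_load(2048,
1176) gives 1783 when the core capacity is 8 * (1178 / 8) = 1176, while
group_avg_load(2048, 8192) gives 256 once every thread counts as 1024.
The same two tasks look roughly seven times lighter, which is why the
1-task-per-core guarantee from the sibling flag matters here.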

> >
> > I always wonder why arch_scale_cpu_capacity() is called with a NULL
> > sched_domain in scale_rt_capacity(). This way capacity might actually
>
> Probably because until this v4.19-rcxx version, the rt scaling was done
> relative to the local CPU capacity:
> capacity  = arch_scale_cpu() * scale_rt_capacity / SCHED_CAPACITY_SCALE
>
> Whereas now, it directly returns the remaining capacity
>
> > be more than the capacity_orig. I have always been under the impression
> > that capacity_orig > capacity. Or am I misunderstanding that?
>
> You are right, there is a bug for SMT and the patch below should fix it.
> Nevertheless, we still have the problem in some other places in the code.
>
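
A simplified sketch of the two schemes, to show where capacity can exceed
capacity_orig. It ignores IRQ scaling, assumes a per-thread capacity_orig
of 147 (1178 / 8) on SMT 8, and uses made-up helper names and a made-up
rt/dl utilization of 100.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE 1024UL

/*
 * Old scheme: scale_rt_capacity() yields a factor in [0, 1024] that is
 * applied to the CPU's own capacity, so the result cannot exceed it.
 */
static inline unsigned long capacity_old(unsigned long cpu_capacity_orig,
					 unsigned long scale_factor)
{
	assert(scale_factor <= SCHED_CAPACITY_SCALE);
	return cpu_capacity_orig * scale_factor / SCHED_CAPACITY_SCALE;
}

/*
 * New scheme (simplified): return what is left of 'max' after rt/dl
 * utilization, with a floor of 1. 'max' is whatever
 * arch_scale_cpu_capacity() reported to the caller.
 */
static inline unsigned long capacity_new(unsigned long max,
					 unsigned long used)
{
	if (used >= max)
		return 1;
	return max - used;
}
```

With a NULL sched_domain, max is 1024 and capacity_new(1024, 100) returns
924, well above the capacity_orig of 147. Passing the sched_domain makes
max 147, and capacity_new(147, 100) returns 47. For comparison,
capacity_old(147, 924) returns 132, which stays below capacity_orig by
construction.
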
> Subject: [PATCH] sched/fair: fix scale_rt_capacity() for SMT
>
> Since commit:
> commit 523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")
> scale_rt_capacity() returns the remaining capacity and not a scale factor
> to apply on cpu_capacity_orig. arch_scale_cpu() is directly called by
> scale_rt_capacity() so we must take the sched_domain argument
>
> Fixes: 523e979d3164 ("sched/core: Use PELT for scale_rt_capacity()")
> Reported-by: Srikar Dronamraju <srikar@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>

Reviewed-by: Srikar Dronamraju <srikar@xxxxxxxxxxxxxxxxxx>

--
Thanks and Regards
Srikar Dronamraju