Re: [PATCH v3 1/3] sched/fair: Add asymmetric CPU capacity wakeup scan

From: Valentin Schneider
Date: Tue Jan 28 2020 - 06:30:33 EST


Hi Pavan,

On 28/01/2020 06:22, Pavan Kondeti wrote:
> Hi Valentin,
>
> On Sun, Jan 26, 2020 at 08:09:32PM +0000, Valentin Schneider wrote:
>>
>> +static inline int check_cpu_capacity(struct rq *rq, struct sched_domain *sd);
>> +
>> +/*
>> + * Scan the asym_capacity domain for idle CPUs; pick the first idle one on which
>> + * the task fits. If no CPU is big enough, but there are idle ones, try to
>> + * maximize capacity.
>> + */
>> +static int select_idle_capacity(struct task_struct *p, int target)
>> +{
>> + unsigned long best_cap = 0;
>> + struct sched_domain *sd;
>> + struct cpumask *cpus;
>> + int best_cpu = -1;
>> + struct rq *rq;
>> + int cpu;
>> +
>> + if (!static_branch_unlikely(&sched_asym_cpucapacity))
>> + return -1;
>> +
>> + sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, target));
>> + if (!sd)
>> + return -1;
>> +
>> + sync_entity_load_avg(&p->se);
>> +
>> + cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>> + cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>> +
>> + for_each_cpu_wrap(cpu, cpus, target) {
>> + rq = cpu_rq(cpu);
>> +
>> + if (!available_idle_cpu(cpu))
>> + continue;
>> + if (task_fits_capacity(p, rq->cpu_capacity))
>> + return cpu;
>
> I have a couple of questions.
>
> (1) Any particular reason for not checking sched_idle_cpu() as a backup
> for the case where all eligible CPUs are busy? select_idle_cpu() does
> that.
>

No particular reason other than we didn't consider it, I think. I don't see
any harm in folding it in, so I'll do that for v4. I am curious, however: are
you folks making use of SCHED_IDLE? AFAIA Android isn't making use of it
yet, though Viresh paved the way for that to happen.
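
For reference, folding it in could look something like the below (untested
sketch on top of this patch, trimmed to the relevant bits; sched_idle_cpu()
is the existing helper in fair.c):

	for_each_cpu_wrap(cpu, cpus, target) {
		rq = cpu_rq(cpu);

		/* Also accept CPUs that only have SCHED_IDLE tasks on them */
		if (!available_idle_cpu(cpu) && !sched_idle_cpu(cpu))
			continue;
		if (task_fits_capacity(p, rq->cpu_capacity))
			return cpu;

		/* Remember the most capable candidate in case nothing fits */
		if (rq->cpu_capacity > best_cap) {
			best_cap = rq->cpu_capacity;
			best_cpu = cpu;
		}
	}

	return best_cpu;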

> (2) Assuming all CPUs are busy, we return -1 from here and end up
> calling select_idle_cpu(). The traversal in select_idle_cpu() may be a
> waste in cases where sd_llc == sd_asym_cpucapacity, for example SDM845.
> Should we worry about this?
>

Before v3, select_idle_capacity() had no fallback CPU selection of its own,
so we needed the fall-through to select_idle_cpu() (we could have skipped an
idle CPU just because its capacity wasn't high enough).

That's not the case anymore, so indeed we may be able to bail out of
select_idle_sibling() right after select_idle_capacity() (or after the
prev / recent_used_cpu checks). Our only requirement here is that sd_llc
remains a subset of sd_asym_cpucapacity.
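
Roughly, I have something like the below in mind for select_idle_sibling()
(untested sketch; exact placement relative to the prev / recent_used_cpu
checks is still TBD):

	if (static_branch_unlikely(&sched_asym_cpucapacity)) {
		int i = select_idle_capacity(p, target);

		/*
		 * sd_llc is a subset of sd_asym_cpucapacity, so the scan
		 * above already covered every CPU select_idle_cpu() would
		 * look at; don't scan twice.
		 * NB: this glosses over targets without an
		 * sd_asym_cpucapacity domain, which would still want the
		 * regular symmetric path.
		 */
		if ((unsigned int)i < nr_cpumask_bits)
			return i;
		return target;
	}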

So far for Arm topologies we can have:
- sd_llc < sd_asym_cpucapacity (e.g. legacy big.LITTLE like Juno)
- sd_llc == sd_asym_cpucapacity (e.g. DynamIQ like SDM845)

I'm slightly worried about sd_llc > sd_asym_cpucapacity ever being an
actual thing - I don't believe it makes much sense, but that's not stopping
anyone.

AFAIA we (Arm) *currently* don't allow that with big.LITTLE or DynamIQ, nor
do I think it can happen with the default scheduler topology where MC is
the last possible level we can have as sd_llc.

So it *might* be a safe assumption - and I can still add a SCHED_WARN_ON().
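
For the warning, I'm thinking of something along these lines (sketch only;
whether it lives in the wakeup path or in the domain build code is up for
debate, and it assumes we're in an RCU read-side section):

	struct sched_domain *llc = rcu_dereference(per_cpu(sd_llc, cpu));
	struct sched_domain *asym = rcu_dereference(per_cpu(sd_asym_cpucapacity, cpu));

	/* Bailing out early is only safe if the LLC span sits within the asym span */
	if (llc && asym)
		SCHED_WARN_ON(!cpumask_subset(sched_domain_span(llc),
					      sched_domain_span(asym)));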