Re: [PATCH] sched: rt: Make RT capacity aware

From: Qais Yousef
Date: Tue Oct 08 2019 - 02:13:12 EST

Next message: Jani Nikula: "Re: New sysfs interface for privacy screens"
Previous message: Christoph Hellwig: "Re: [PATCH 09/11] xfs: remove the fork fields in the writepage_ctx and ioend"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 10/07/19 11:14, Dietmar Eggemann wrote:
> On 23/09/2019 13:52, Qais Yousef wrote:
> > On 09/20/19 14:52, Dietmar Eggemann wrote:
> >>> 2. The fallback mechanism means we either have to call cpupri_find()
> >>> twice, once to find the filtered lowest_rq and once to return the
> >>> non-filtered version.
> >>
> >> This is what I have in mind. (Only compile tested! ... and the 'if
> >> (cpumask_any(lowest_mask) >= nr_cpu_ids)' condition has to be considered
> >> as well):
> >>
> >> @@ -98,8 +103,26 @@ int cpupri_find(struct cpupri *cp, struct task_struct *p,
> >>  			continue;
> >>
> >>  		if (lowest_mask) {
> >> +			int cpu, max_cap_cpu = -1;
> >> +			unsigned long max_cap = 0;
> >> +
> >>  			cpumask_and(lowest_mask, p->cpus_ptr, vec->mask);
> >>
> >> +			for_each_cpu(cpu, lowest_mask) {
> >> +				unsigned long cap = arch_scale_cpu_capacity(cpu);
> >> +
> >> +				if (!rt_task_fits_capacity(p, cpu))
> >> +					cpumask_clear_cpu(cpu, lowest_mask);
> >> +
> >> +				if (cap > max_cap) {
> >> +					max_cap = cap;
> >> +					max_cap_cpu = cpu;
> >> +				}
> >> +			}
> >> +
> >> +			if (cpumask_empty(lowest_mask) && max_cap)
> >> +				cpumask_set_cpu(max_cap_cpu, lowest_mask);
> >
> > I had a patch that I was testing but what I did is to continue rather than
> > return a max_cap_cpu.
>
> Continuing is the correct thing to do here. I just tried to illustrate
> the idea.
>
> > e.g:
> >
> > 	if no cpu at current priority fits the task:
> > 		continue;
> > 	else:
> > 		return the lowest_mask which contains fitting cpus only
> >
> > 	if no fitting cpu was found:
> > 		return 0;
>
> I guess this is what we want to achieve here. It's unavoidable that we
> will run sooner (compared to an SMP system) into a situation in which we
> have to go higher in the rd->cpupri->pri_to_cpu[] array or in which we
> can't return a lower mask at all.
>
> > Or we can tweak your approach to be
> >
> > 	if no cpu at current priority fits the task:
> > 		if the cpu the task is currently running on doesn't fit it:
> > 			return lowest_mask with max_cap_cpu set;
>
> I wasn't aware of the pri_to_cpu[] array and how cpupri_find(,
> lowest_mask) tries to return the lowest_mask of the lowest priority
> (pri_to_cpu[] index).
>
> > So we either:
> >
> > 1. Continue the search until we find a fitting CPU; bail out otherwise.
>
> If this describes the solution in which we concentrate the
> capacity-awareness in cpupri_find(), then I'm OK with it.
> find_lowest_rq() already favours task_cpu(task), this_cpu and finally
> cpus in sched_groups (from the viewpoint of task_cpu(task)).
>
> > 2. Or we attempt to return a CPU only if the CPU the task is currently
> > running on doesn't fit it. We don't want to migrate the task from a
> > fitting to a non-fitting one.
>
> I would prefer 1., keeping the necessary changes confined in
> cpupri_find() if possible.

We are in agreement then.
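
To make option 1 concrete, here is a rough userspace sketch of the "continue until a fitting CPU is found" search. It is not the kernel code: the toy_* names, the bitmask representation of pri_to_cpu[], the capacity values and the single min_capacity field are all assumptions made for illustration, and the real rt_task_fits_capacity() check may look quite different.

```c
#include <stdint.h>

#define TOY_NR_CPUS 4
#define TOY_NR_PRIO 4

/* Hypothetical capacities: CPUs 0-1 are little, CPUs 2-3 are big. */
static const unsigned long toy_capacity[TOY_NR_CPUS] = { 446, 446, 1024, 1024 };

/* Hypothetical task: an affinity bitmask and the minimum capacity it needs. */
struct toy_task {
	uint32_t cpus_allowed;
	unsigned long min_capacity;
};

/* Simplified stand-in for a capacity fitness check. */
static int toy_task_fits(const struct toy_task *p, int cpu)
{
	return toy_capacity[cpu] >= p->min_capacity;
}

/*
 * pri_to_cpu[idx] is a bitmask of the CPUs currently at priority level idx
 * (lower idx means lower priority). Only levels below task_pri are searched.
 *
 * Returns 1 and fills *lowest_mask with fitting CPUs only, or returns 0 if
 * no level below task_pri has a fitting CPU.
 */
static int toy_cpupri_find(const uint32_t pri_to_cpu[TOY_NR_PRIO],
			   const struct toy_task *p, int task_pri,
			   uint32_t *lowest_mask)
{
	for (int idx = 0; idx < task_pri && idx < TOY_NR_PRIO; idx++) {
		uint32_t mask = pri_to_cpu[idx] & p->cpus_allowed;

		/* Drop every CPU that does not fit the task. */
		for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
			if ((mask & (1u << cpu)) && !toy_task_fits(p, cpu))
				mask &= ~(1u << cpu);
		}

		/* No CPU at this priority fits: try the next level. */
		if (!mask)
			continue;

		*lowest_mask = mask;
		return 1;
	}

	/* No fitting CPU was found at any lower priority. */
	return 0;
}
```

With this shape the caller never sees a non-fitting CPU in the mask, which is the reason for keeping the filtering in one place.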

>
> > We can also do something hybrid like:
> >
> > 3. Remember the outcome of 2 but don't return immediately and attempt
> > to find a fitting CPU at a different priority level.
> >
> >
> > Personally I see 1 as the simplest and a good enough solution. What do you think?
>
> Agreed. We would potentially need a fast lookup for p -> uclamp_cpumask
> though?

We can extend task_struct to store a cpumask of the cpus that fit the uclamp
settings and keep it up-to-date whenever the uclamp values change. I did
consider that but it seemed better to keep the implementation confined. I could
have been too conservative - so I'd be happy to look at that.
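
A minimal sketch of what such a cached mask could look like, in plain userspace C. Nothing here exists in the kernel: demo_task, its uclamp_cpumask field, the demo_* helpers and the capacity values are hypothetical, and a plain 32-bit bitmask stands in for a real cpumask.

```c
#include <stdint.h>

#define DEMO_NR_CPUS 4

/* Hypothetical capacities: CPUs 0-1 are little, CPUs 2-3 are big. */
static const unsigned long demo_capacity[DEMO_NR_CPUS] = { 446, 446, 1024, 1024 };

/* Hypothetical task with a cached mask of CPUs that fit its uclamp request. */
struct demo_task {
	unsigned long uclamp_min;	/* minimum capacity the task asks for */
	uint32_t uclamp_cpumask;	/* cached: CPUs with capacity >= uclamp_min */
};

/* Recompute the cached mask; this is the slow path. */
static void demo_update_uclamp_cpumask(struct demo_task *p)
{
	uint32_t mask = 0;

	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		if (demo_capacity[cpu] >= p->uclamp_min)
			mask |= 1u << cpu;
	}
	p->uclamp_cpumask = mask;
}

/* Every change of the uclamp value refreshes the cached mask. */
static void demo_set_uclamp_min(struct demo_task *p, unsigned long value)
{
	p->uclamp_min = value;
	demo_update_uclamp_cpumask(p);
}

/* Fast path: filtering candidate CPUs is a single AND. */
static uint32_t demo_filter_fitting(const struct demo_task *p, uint32_t candidates)
{
	return candidates & p->uclamp_cpumask;
}
```

The trade-off is an extra field per task and an update on every uclamp change, in exchange for not walking the CPUs on each lookup.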

Thanks

--
Qais Yousef

>
> > I think this 'continue' to search makes doing it at cpupri_find() more
> > robust than having to deal with whatever mask we first found in
> > find_lowest_rq() - so I'm starting to like this approach better. Thanks for
> > bringing it up.
>
> My main concern is that having rt_task_fits_capacity() added to almost
> every condition in the code makes it hard to understand what capacity
> awareness in RT wants to achieve.
>
> [...]
