Re: [PATCH v6 13/21] sched: Admit forcefully-affined tasks into SCHED_DEADLINE

From: Will Deacon
Date: Fri May 21 2021 - 06:37:43 EST


On Fri, May 21, 2021 at 10:39:32AM +0200, Juri Lelli wrote:
> On 21/05/21 08:15, Quentin Perret wrote:
> > On Friday 21 May 2021 at 07:25:51 (+0200), Juri Lelli wrote:
> > > On 20/05/21 19:01, Will Deacon wrote:
> > > > On Thu, May 20, 2021 at 02:38:55PM +0200, Daniel Bristot de Oliveira wrote:
> > > > > On 5/20/21 12:33 PM, Quentin Perret wrote:
> > > > > > On Thursday 20 May 2021 at 11:16:41 (+0100), Will Deacon wrote:
> > > > > >> Ok, thanks for the insight. In which case, I'll go with what we discussed:
> > > > > >> require admission control to be disabled for sched_setattr() but allow
> > > > > >> execve() to a 32-bit task from a 64-bit deadline task with a warning (this
> > > > > >> is probably similar to CPU hotplug?).
> > > > > >
> > > > > > Still not sure that we can let execve go through ... It will break AC
> > > > > > all the same, so it should probably fail as well if AC is on IMO
> > > > > >
> > > > >
> > > > > If the cpumask of the 32-bit task is != of the 64-bit task that is executing it,
> > > > > the admission control needs to be re-executed, and it could fail. So I see this
> > > > > operation equivalent to sched_setaffinity(). This will likely be true for future
> > > > > schedulers that will allow arbitrary affinities (AC should run on affinity
> > > > > change, and could fail).
> > > > >
> > > > > I would vote with Juri: "I'd go with fail hard if AC is on, let it
> > > > > pass if AC is off (supposedly the user knows what to do)," (also hope nobody
> > > > > complains until we add better support for affinity, and use this as a motivation
> > > > > to get back on this front).
> > > >
> > > > I can have a go at implementing it, but I don't think it's a great solution
> > > > and here's why:
> > > >
> > > > Failing an execve() is _very_ likely to be fatal to the application. It's
> > > > also very likely that the task calling execve() doesn't know whether the
> > > > program it's trying to execute is 32-bit or not. Consequently, if we go
> > > > with failing execve() then all that will happen is that people will disable
> > > > admission control altogether.
> >
> > Right, but only on these dumb 32bit asymmetric systems, and only if we
> > care about running 32bits deadline tasks -- which I seriously doubt for
> > the Android use-case.
> >
> > Note that running deadline tasks is also a privileged operation, it
> > can't be done by random apps.
> >
> > > > That has a negative impact on "pure" 64-bit
> > > > applications and so I think we end up with the tail wagging the dog because
> > > > admission control will be disabled for everybody just because there is a
> > > > handful of 32-bit programs which may get executed. I understand that it
> > > > also means that RT throttling would be disabled.
> > >
> > > Completely understand your perplexity. But how can the kernel still give
> > > guarantees to "pure" 64-bit applications if there are 32-bit
> > > applications around that essentially broke admission control when they
> > > were restricted to a subset of cores?
> > >
> > > > Allowing the execve() to continue with a warning is very similar to the
> > > > case in which all the 64-bit CPUs are hot-unplugged at the point of
> > > > execve(), and this is much closer to the illusion that this patch series
> > > > intends to provide.
> > >
> > > So, for hotplug we currently have a check that would make hotplug
> > > operations fail if removing a CPU would mean not enough bandwidth to run
> > > the currently admitted set of DEADLINE tasks.
> >
> > Aha, wasn't aware. Any pointers to that check for my education?
>
> Hotplug ends up calling dl_cpu_busy() (after the cpu being hotplugged out
> got removed), IIRC. So, if that fails the operation is undone.

Interesting, thanks. Thinking about this some more, it strikes me that with
these silly asymmetric systems there could be an interesting additional
problem with hotplug and deadline tasks. Imagine the following sequence of
events:

1. All online CPUs are 32-bit-capable
2. sched_setattr() admits a 32-bit deadline task
3. A 64-bit-only CPU is onlined
4. Some of the 32-bit-capable CPUs are offlined

I wonder if we can get into a situation where we think we have enough
bandwidth available, but in reality the 32-bit task is in trouble because
it can't make use of the 64-bit-only CPU.

If so, then it seems to me that admission control is really just
"best-effort" for 32-bit deadline tasks on these systems because it's based
on a snapshot in time of the available resources.
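
To make that sequence concrete, here is a toy user-space model of it. It is
only a sketch: none of these names exist in the kernel, and it assumes that
admission control boils down to comparing the total admitted bandwidth
against 100% per online CPU, which glosses over root domains and whatever
dl_cpu_busy() really does. The only point is to contrast a check over all
online CPUs with one restricted to the 32-bit-capable CPUs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: every name here is hypothetical, not kernel API. */
#define TOY_CPU_BW 100U	/* bandwidth of one CPU, in percent */

struct toy_system {
	uint32_t online;		/* bitmask of online CPUs */
	uint32_t cap32;			/* bitmask of 32-bit-capable CPUs */
	unsigned int admitted_bw32;	/* total bw of admitted 32-bit tasks */
};

unsigned int count_cpus(uint32_t mask)
{
	unsigned int n = 0;

	for (; mask; mask >>= 1)
		n += mask & 1;
	return n;
}

/* Check against every online CPU, regardless of 32-bit capability. */
bool global_check_ok(const struct toy_system *s)
{
	return s->admitted_bw32 <= count_cpus(s->online) * TOY_CPU_BW;
}

/* Check against only the online CPUs that 32-bit tasks can run on. */
bool affinity_check_ok(const struct toy_system *s)
{
	uint32_t usable = s->online & s->cap32;

	return s->admitted_bw32 <= count_cpus(usable) * TOY_CPU_BW;
}

/* Admit a 32-bit deadline task using the global check (a snapshot). */
bool toy_admit32(struct toy_system *s, unsigned int bw)
{
	s->admitted_bw32 += bw;
	if (!global_check_ok(s)) {
		s->admitted_bw32 -= bw;	/* reject: not enough bandwidth */
		return false;
	}
	return true;
}

void toy_online(struct toy_system *s, unsigned int cpu)
{
	assert(cpu < 32);
	s->online |= UINT32_C(1) << cpu;
}

/* Offline a CPU; undo the operation if the global check then fails. */
bool toy_offline(struct toy_system *s, unsigned int cpu)
{
	uint32_t old = s->online;

	assert(cpu < 32);
	s->online &= ~(UINT32_C(1) << cpu);
	if (!global_check_ok(s)) {
		s->online = old;
		return false;
	}
	return true;
}
```

Starting with CPUs 0 and 1 online and both 32-bit-capable, admitting two
32-bit tasks at 75% each, onlining a 64-bit-only CPU 2 and then offlining
CPU 1 leaves this model with 150% admitted against 200% of online capacity,
so the offline is allowed. Only 100% of that capacity is usable by the
32-bit tasks, though, so affinity_check_ok() reports a shortfall while
global_check_ok() still passes.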

Will