Re: [PATCH v6 13/21] sched: Admit forcefully-affined tasks into SCHED_DEADLINE

From: Quentin Perret
Date: Fri May 21 2021 - 04:16:01 EST


On Friday 21 May 2021 at 07:25:51 (+0200), Juri Lelli wrote:
> On 20/05/21 19:01, Will Deacon wrote:
> > On Thu, May 20, 2021 at 02:38:55PM +0200, Daniel Bristot de Oliveira wrote:
> > > On 5/20/21 12:33 PM, Quentin Perret wrote:
> > > > On Thursday 20 May 2021 at 11:16:41 (+0100), Will Deacon wrote:
> > > >> Ok, thanks for the insight. In which case, I'll go with what we discussed:
> > > >> require admission control to be disabled for sched_setattr() but allow
> > > >> execve() to a 32-bit task from a 64-bit deadline task with a warning (this
> > > >> is probably similar to CPU hotplug?).
> > > >
> > > > Still not sure that we can let execve go through ... It will break AC
> > > > all the same, so it should probably fail as well if AC is on IMO
> > > >
> > >
> > > If the cpumask of the 32-bit task differs from that of the 64-bit task that
> > > is executing it, the admission control needs to be re-executed, and it could
> > > fail. So I see this operation as equivalent to sched_setaffinity(). This will
> > > likely be true for future
> > > schedulers that will allow arbitrary affinities (AC should run on affinity
> > > change, and could fail).
> > >
> > > I would vote with Juri: "I'd go with fail hard if AC is on, let it
> > > pass if AC is off (supposedly the user knows what to do)," (also hope nobody
> > > complains until we add better support for affinity, and use this as a motivation
> > > to get back on this front).
> >
> > I can have a go at implementing it, but I don't think it's a great solution
> > and here's why:
> >
> > Failing an execve() is _very_ likely to be fatal to the application. It's
> > also very likely that the task calling execve() doesn't know whether the
> > program it's trying to execute is 32-bit or not. Consequently, if we go
> > with failing execve() then all that will happen is that people will disable
> > admission control altogether.

Right, but only on these dumb 32-bit asymmetric systems, and only if we
care about running 32-bit deadline tasks -- which I seriously doubt for
the Android use-case.

Note that running deadline tasks is also a privileged operation, so it
can't be done by random apps.

> > That has a negative impact on "pure" 64-bit
> > applications and so I think we end up with the tail wagging the dog because
> > admission control will be disabled for everybody just because there is a
> > handful of 32-bit programs which may get executed. I understand that it
> > also means that RT throttling would be disabled.
>
> Completely understand your perplexity. But how can the kernel still give
> guarantees to "pure" 64-bit applications if there are 32-bit
> applications around that essentially broke admission control when they
> were restricted to a subset of cores?
>
> > Allowing the execve() to continue with a warning is very similar to the
> > case in which all the 64-bit CPUs are hot-unplugged at the point of
> > execve(), and this is much closer to the illusion that this patch series
> > intends to provide.
>
> So, for hotplug we currently have a check that would make hotplug
> operations fail if removing a CPU would mean not enough bandwidth to run
> the currently admitted set of DEADLINE tasks.

Aha, wasn't aware. Any pointers to that check for my education?
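
For anyone following along, the kind of bandwidth test being discussed
(admit a task, or remove a CPU, only if the remaining capacity still
covers the admitted DEADLINE bandwidth) can be modelled in user space
roughly as below. This is only a sketch under simplifying assumptions:
all CPUs have equal capacity, and every name here (rd_model,
would_overflow, admit_task, can_remove_cpu, BW_SHIFT) is made up for
illustration and is not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fixed-point fraction bits; an assumed value for this sketch. */
#define BW_SHIFT 20

/*
 * Utilization of one task as a fixed-point ratio runtime/period.
 * Inputs are assumed small enough that the shift cannot overflow.
 */
static uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
	assert(period > 0);
	return (runtime << BW_SHIFT) / period;
}

/* Hypothetical model of the DEADLINE accounting for one root domain. */
struct rd_model {
	uint64_t limit_bw;	/* per-CPU bandwidth cap, e.g. 95% */
	uint64_t total_bw;	/* sum of bandwidths already admitted */
	unsigned int ncpus;	/* CPUs assumed usable by every task */
};

/* True if 'ncpus' CPUs cannot hold the admitted set plus 'new_bw'. */
static bool would_overflow(const struct rd_model *rd, unsigned int ncpus,
			   uint64_t new_bw)
{
	return rd->limit_bw * ncpus < rd->total_bw + new_bw;
}

/* Admission: account the task only if the bandwidth test passes. */
static bool admit_task(struct rd_model *rd, uint64_t period, uint64_t runtime)
{
	uint64_t bw = to_ratio(period, runtime);

	if (would_overflow(rd, rd->ncpus, bw))
		return false;
	rd->total_bw += bw;
	return true;
}

/* Hotplug-style check: refuse if one CPU fewer cannot hold the set. */
static bool can_remove_cpu(const struct rd_model *rd)
{
	return rd->ncpus > 1 && !would_overflow(rd, rd->ncpus - 1, 0);
}
```

The model also shows why forced affinity is awkward here: the test
assumes every admitted task can use all 'ncpus' CPUs, which stops being
true once a 32-bit task is restricted to a subset of them.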

> > So, personally speaking, I would prefer the behaviour where we refuse to
> > admit 32-bit tasks via sched_setattr() if the root domain contains
> > 64-bit CPUs, but we _don't_ fail execve() of a 32-bit program from a
> > 64-bit deadline task.
>
> OK, this is interesting and I guess a very valid alternative. That would
> force users to create exclusive domains for 32-bit tasks, right?

FWIW this is not practical at all for our use-cases, the implications of
splitting the system in independent root-domains are way too important
for us to be able to recommend that. Disabling AC, OTOH, sounds simple
enough. The RT throttling part is the only 'worrying' part, but even
that may not be the end of the world.

Thanks!
Quentin