Re: [RFC 00/60] Coscheduling for Linux

From: Frederic Weisbecker
Date: Fri Nov 23 2018 - 11:29:09 EST


On Thu, Sep 27, 2018 at 11:36:34AM -0700, Subhra Mazumdar wrote:
>
>
> On 09/26/2018 02:58 AM, Jan H. Schönherr wrote:
> >On 09/17/2018 02:25 PM, Peter Zijlstra wrote:
> >>On Fri, Sep 14, 2018 at 06:25:44PM +0200, Jan H. Schönherr wrote:
> >>
> >>>Assuming, there is a cgroup-less solution that can prevent simultaneous
> >>>execution of tasks on a core, when they're not supposed to. How would you
> >>>tell the scheduler, which tasks these are?
> >>Specifically for L1TF I hooked into/extended KVM's preempt_notifier
> >>registration interface, which tells us which tasks are VCPUs and to
> >>which VM they belong.
> >>
> >>But if we want to actually expose this to userspace, we can either do a
> >>prctl() or extend struct sched_attr.
> >Both, Peter and Subhra, seem to prefer an interface different than cgroups
> >to specify what to coschedule.
> >
> >Can you provide some extra motivation for me, why you feel that way?
> >(ignoring the current scalability issues with the cpu group controller)
> >
> >
> >After all, cgroups were designed to create arbitrary groups of tasks and
> >to attach functionality to those groups.
> >
> >If we were to introduce a different interface to control that, we'd need to
> >introduce a whole new group concept, so that you make tasks part of some
> >group while at the same time preventing unauthorized tasks from joining a
> >group.
> >
> >
> >I currently don't see any wins, just a loss in flexibility.
> >
> >Regards
> >Jan
> I think cgroups will get the job done for any use case. But we have,
> e.g. affinity control via both sched_setaffinity and cgroup cpusets. It
> will be good to have an alternative way to specify co-scheduling too for
> those who don't want to use cgroup for some reason. It can be added later
> on though, only how one will override the other will need to be sorted out.

I kind of agree with Jan here that this is just going to add yet another task
group mechanism, very similar to the existing one, with runqueues inside and all.

Can you imagine kernel/sched/fair.c now dealing with both group implementations?
What happens when cgroup task groups and cosched sched groups don't match wrt.
their tasks, their priorities, etc.?

I understand the cgroup task group has become infamous. But it may be a better
idea in the long run to fix it.