Re: [PATCH RFC] sched: Add a per-thread core scheduling interface

From: Peter Zijlstra
Date: Thu May 28 2020 - 13:04:10 EST


On Sun, May 24, 2020 at 10:00:46AM -0400, Phil Auld wrote:
> On Fri, May 22, 2020 at 05:35:24PM -0400 Joel Fernandes wrote:
> > On Fri, May 22, 2020 at 02:59:05PM +0200, Peter Zijlstra wrote:
> > [..]
> > > > > It doens't allow tasks for form their own groups (by for example setting
> > > > > the key to that of another task).
> > > >
> > > > So for this, I was thinking of making the prctl pass in an integer. And 0
> > > > would mean untagged. Does that sound good to you?
> > >
> > > A TID, I think. If you pass your own TID, you tag yourself as
> > > not-sharing. If you tag yourself with another tasks's TID, you can do
> > > ptrace tests to see if you're allowed to observe their junk.
> >
> > But that would require a bunch of tasks agreeing on which TID to tag with.
> > For example, if 2 tasks tag with each other's TID, then they would have
> > different tags and not share.

Well, don't do that then ;-)

> > What's wrong with passing in an integer instead? In any case, we would do the
> > CAP_SYS_ADMIN check to limit who can do it.

So the actual permission model can be different depending on how broken
the hardware is.

> > Also, one thing CGroup interface allows is an external process to set the
> > cookie, so I am wondering if we should use sched_setattr(2) instead of, or in
> > addition to, the prctl(2). That way, we can drop the CGroup interface
> > completely. How do you feel about that?
> >
>
> I think it should be an arbitrary 64bit value, in both interfaces to avoid
> any potential reuse security issues.
>
> I think the cgroup interface could be extended not to be a boolean but take
> the value. With 0 being untagged as now.

How do you avoid reuse in such a huge space? That just creates yet
another problem for the kernel to keep track of who is who.

With random u64 numbers, it even becomes hard to determine if you're
sharing at all or not.

Now, with the current SMT+MDS trainwreck, any sharing is bad because it
allows leaking kernel privates. But under a less severe thread scenario,
say where only user data would be at risk, the ptrace() tests make
sense, but those become really hard with random u64 numbers too.

What would the purpose of random u64 values be for cgroups? That only
replicates the problem of determining uniqueness there. Then you can get
two cgroups unintentionally sharing because you got lucky.

Also, fundamentally, we cannot have more threads than TID space, it's a
natural identifier.