Re: [PATCH -tip 22/32] sched: Split the cookie and setup per-task cookie on fork

From: Peter Zijlstra
Date: Tue Dec 01 2020 - 14:22:45 EST


On Tue, Dec 01, 2020 at 02:11:33PM -0500, Joel Fernandes wrote:
> On Wed, Nov 25, 2020 at 12:15:41PM +0100, Peter Zijlstra wrote:
> > On Tue, Nov 17, 2020 at 06:19:52PM -0500, Joel Fernandes (Google) wrote:
> >
> > > +/*
> > > + * Ensure that the task has been requeued. The stopper ensures that the task cannot
> > > + * be migrated to a different CPU while its core scheduler queue state is being updated.
> > > + * It also makes sure to requeue a task if it was running actively on another CPU.
> > > + */
> > > +static int sched_core_task_join_stopper(void *data)
> > > +{
> > > +	struct sched_core_task_write_tag *tag = (struct sched_core_task_write_tag *)data;
> > > +	int i;
> > > +
> > > +	for (i = 0; i < 2; i++)
> > > +		sched_core_tag_requeue(tag->tasks[i], tag->cookies[i], false /* !group */);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static int sched_core_share_tasks(struct task_struct *t1, struct task_struct *t2)
> > > +{
> >
> > > + stop_machine(sched_core_task_join_stopper, (void *)&wr, NULL);
> >
> > > +}
> >
> > This is *REALLY* terrible...
>
> I pulled this bit from your original patch. Are you concerned about the
> stop_machine? Sharing a core is a slow path for our usecases (and as far as I
> know, for everyone else's). We can probably do something different if that
> requirement changes.
>

Yeah.. so I can (and was planning on) remove stop_machine() from
sched_core_{dis,en}able() before merging it.

(there are two options: one uses stop_cpus() with the SMT mask, the
other uses RCU)

This though is exposing stop_machine() to joe user. Everybody is allowed
to prctl() their own task and set a cookie on themselves. This means you
just made a giant unpriv DoS vector.

stop_machine is bad, really bad.