Re: [PATCH v10 2/5] sched: CGroup tagging interface for core scheduling

From: Chris Hyser
Date: Thu Feb 04 2021 - 15:56:29 EST


On 2/4/21 8:57 AM, Peter Zijlstra wrote:
> On Fri, Jan 22, 2021 at 08:17:01PM -0500, Joel Fernandes (Google) wrote:
> > +/* Request the scheduler to share a core */
> > +#define PR_SCHED_CORE_SHARE 59
> > +# define PR_SCHED_CORE_CLEAR 0 /* clear core_sched cookie of pid */
> > +# define PR_SCHED_CORE_SHARE_FROM 1 /* get core_sched cookie from pid */
> > +# define PR_SCHED_CORE_SHARE_TO 2 /* push core_sched cookie to pid */
>
> Why ?

The simplest interface would be a single 'set' command that specifies and sets a cookie value, with 0 as a special value that clears it. However, an early requirement that people seemed to agree with is that cookies should be opaque and system-guaranteed unique except when explicitly shared. Since two tasks therefore cannot share a cookie by explicitly setting the same cookie value, the prctl() must provide a means of cookie sharing between tasks. The v9 proposal incorporated all of this into a single "from-only" command whose actions depended on the state of the two tasks. If neither has a cookie and one shares from the other, they both get the same new cookie. If the calling task had one and the other didn't, the calling task's cookie was cleared. And of course if the src task has a cookie, the caller just gets it. That does a lot, is a tad overloaded, and is still insufficient.
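For illustration only, the three v9 cases above can be written out as a toy user-space model. This is not the kernel code from the patch: struct task, next_cookie, v9_share_from() and v9_demo() are hypothetical names, and a cookie value of 0 is assumed to stand for "no cookie".

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, not kernel code. All names here are hypothetical. */
struct task {
	uint64_t cookie;	/* 0 is assumed to mean "no cookie" */
};

static uint64_t next_cookie = 1;	/* stand-in for a system-unique cookie source */

/* v9-style "from-only" share: the outcome depends on the state of both tasks. */
void v9_share_from(struct task *caller, struct task *src)
{
	if (src->cookie)
		caller->cookie = src->cookie;	/* src has a cookie: caller just gets it */
	else if (caller->cookie)
		caller->cookie = 0;		/* only the caller has one: it is cleared */
	else
		caller->cookie = src->cookie = next_cookie++;	/* neither: both get a new one */
}

/* Walks through the three cases described in the text. */
void v9_demo(void)
{
	struct task a = { 0 }, b = { 0 }, c = { 0 }, d = { 0 };

	v9_share_from(&a, &b);		/* neither has a cookie */
	assert(a.cookie != 0 && a.cookie == b.cookie);

	v9_share_from(&c, &a);		/* src has a cookie */
	assert(c.cookie == a.cookie);

	v9_share_from(&a, &d);		/* only the caller has a cookie */
	assert(a.cookie == 0 && d.cookie == 0);
}
```

One call name covering three different outcomes is the overloading in question: the caller cannot tell from the command alone whether it will create, copy, or clear a cookie.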

A second complication was a decision that new processes (not threads) do not inherit their parent's cookie. Thus forking is also not a means to share a cookie. Basically, with a "from-only" interface, the new task would need to be modified to call prctl() itself. From-only also does not allow for setting a cookie on an unmodified, already running task. This can be fixed by providing both "to" and "from" sharing interfaces, which allow helper programs to construct arbitrary configurations from unmodified programs.

> Also, how do I set a unique cookie on myself with this interface?

The v10 patch still uses the overloaded v9 mechanism (as mentioned above, if two tasks w/o cookies share a cookie, they get a new shared unique cookie). Yes, that is clearly inconsistent and kludgy. The mechanism is documented in the docs, but it is clearly not obvious from the interface above. I think we got a bit overzealous in patch squashing, and much of this verbiage should have been in the combined commit message.

So based on the above, how about we add a "create" to pair with "clear"? Call it "create" rather than "set", since we are creating a unique opaque cookie rather than setting a particular value. And as mentioned, because one can't specify a cookie directly but only through sharing relationships, we need both "to" and "from" to make it completely usable.

So we end up with something like this:
PR_SCHED_CORE_CREATE -- give yourself a unique cookie
PR_SCHED_CORE_CLEAR -- clear your core sched cookie
PR_SCHED_CORE_SHARE_FROM <src_task> -- get their cookie for you
PR_SCHED_CORE_SHARE_TO <dest_task> -- push your cookie to them
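
As a rough sketch of how these four commands might behave, here is a toy user-space model; it is not the prctl() implementation. The enum values, core_sched_cmd() and helper_demo() are hypothetical names, 0 is assumed to mean "no cookie", and sharing from a task that has no cookie is assumed to simply copy that empty state, which the proposal above does not settle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, not kernel code. All names here are hypothetical. */
struct task {
	uint64_t cookie;	/* 0 is assumed to mean "no cookie" */
};

static uint64_t next_cookie = 1;	/* stand-in for a system-unique cookie source */

enum core_cmd { CORE_CREATE, CORE_CLEAR, CORE_SHARE_FROM, CORE_SHARE_TO };

/*
 * self is the calling task; other is the task named by the pid argument
 * and is ignored for CREATE and CLEAR. Returns 0 on success, -1 on error.
 */
int core_sched_cmd(enum core_cmd cmd, struct task *self, struct task *other)
{
	switch (cmd) {
	case CORE_CREATE:		/* give yourself a unique cookie */
		self->cookie = next_cookie++;
		return 0;
	case CORE_CLEAR:		/* clear your cookie */
		self->cookie = 0;
		return 0;
	case CORE_SHARE_FROM:		/* get their cookie for you */
		if (!other)
			return -1;
		self->cookie = other->cookie;
		return 0;
	case CORE_SHARE_TO:		/* push your cookie to them */
		if (!other)
			return -1;
		other->cookie = self->cookie;
		return 0;
	}
	return -1;
}

/* A helper tags two unmodified tasks with one shared cookie, then untags itself. */
void helper_demo(void)
{
	struct task helper = { 0 }, a = { 0 }, b = { 0 };

	core_sched_cmd(CORE_CREATE, &helper, NULL);
	core_sched_cmd(CORE_SHARE_TO, &helper, &a);
	core_sched_cmd(CORE_SHARE_TO, &helper, &b);
	core_sched_cmd(CORE_CLEAR, &helper, NULL);

	assert(a.cookie != 0 && a.cookie == b.cookie);
	assert(helper.cookie == 0);
}
```

helper_demo() mirrors the helper-program case: neither a nor b calls anything itself, which is what a "from-only" interface cannot express.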

An additional question: should the inheritability of a process's cookie be configurable? The current code gives the child process its own unique cookie if the parent had a cookie. That is useful in some cases, but many other configurations could be made much easier with inheritance.

If configurable cookie inheritance would be useful, it might look something like this:

PR_SCHED_CORE_CHILD_INHERIT <0/1> -- 1: child inherits the cookie from its parent. 0: if the parent has a cookie, the child process gets its own unique cookie.
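
To make the two settings concrete, here is a toy user-space model of what fork might do under such a flag; it is only a sketch, not kernel code. model_fork(), the child_inherit field and inherit_demo() are hypothetical names. Two things are assumed rather than taken from the proposal: that the child copies the parent's flag setting, and that a child of a parent with no cookie also has none.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not kernel code. All names here are hypothetical. */
struct task {
	uint64_t cookie;	/* 0 is assumed to mean "no cookie" */
	bool child_inherit;	/* models PR_SCHED_CORE_CHILD_INHERIT <0/1> */
};

static uint64_t next_cookie = 1;	/* stand-in for a system-unique cookie source */

/* Models cookie handling when a new process (not a thread) is forked. */
struct task model_fork(const struct task *parent)
{
	/* Assumption: the child starts with the parent's flag setting. */
	struct task child = { .cookie = 0, .child_inherit = parent->child_inherit };

	if (parent->cookie)
		child.cookie = parent->child_inherit
			? parent->cookie	/* 1: inherit the parent's cookie */
			: next_cookie++;	/* 0: child gets its own unique cookie */
	return child;
}

/* Shows both flag settings for a parent that already has a cookie. */
void inherit_demo(void)
{
	struct task parent = { .cookie = next_cookie++, .child_inherit = true };
	struct task child = model_fork(&parent);

	assert(child.cookie == parent.cookie);

	parent.child_inherit = false;
	child = model_fork(&parent);
	assert(child.cookie != 0 && child.cookie != parent.cookie);
}
```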

-chrish