Re: [PATCH v2 RFC 08/13] sched/qos: Add a new sched-qos interface
From: Chen, Yu C
Date: Thu May 07 2026 - 10:25:25 EST
On 5/7/2026 5:55 PM, Qais Yousef wrote:
On 05/06/26 13:38, Tim Chen wrote:
On Mon, 2026-05-04 at 02:59 +0100, Qais Yousef wrote:
[ ... ]
The idea is that the cookie is per QOS per process. So QOS_TYPE_A would have
its unique cookie range, and QOS_TYPE_B would have its independent unique
cookie range. This allows the flexibility and extensibility to describe
independent behavior that requires independent grouping.
From a user point of view, I can think of the following use cases for fine-grained
cache-aware scheduling:
u1. A user wants to enable or disable cache-aware scheduling for all
threads of a process. (No extra tagging is needed.)
u2. A user wants to enable or disable cache-aware scheduling for all
tasks within a cgroup. (No extra tagging is needed.) Vern from
Tencent was advocating for this model.
u3. A user wants to enable or disable cache-aware scheduling for an
arbitrary set of tasks. (Userspace tagging is required.)
If I understand correctly, u3 is exactly the use case where schedqos
cookie can help. Under your design, we cannot tag an arbitrary set of
tasks with the same cookie; we are only allowed to assign the same cookie
to threads within the same process and under the same QoS type. So
this might eliminate the case where different processes share data
with each other that we want to aggregate (NUMA balancing's numa_group
is an indicator of tasks sharing data).
We probably need a sched_qos_cookie structure, defined analogously to
sched_core_cookie, to anchor the tasks. The per-task sched_qos_cookie could then
hold a pointer value to that structure, as with sched_core_cookie, instead of
being a __u32 as in the patch below.
As part of the API or as an internal implementation detail? I think we do need
a cookie structure that stores the sched_qos_type and sched_qos_cookie tuple
internally as an implementation detail, but it should not be exposed as an interface.
Yes, I think Tim was referring to the internal implementation. We need
a pointer to link tasks to their shared sched_qos_cookie.
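To make the internal anchor concrete, here is a minimal userspace sketch of a reference-counted cookie object that tasks point to. It is not taken from the patch: every name (struct qos_cookie, struct task, qos_cookie_alloc(), qos_cookie_attach(), qos_cookie_detach(), qos_same_group()) is hypothetical, and a real kernel version would use refcount_t and proper locking.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; illustration only, not the patch's code. */
enum qos_type { QOS_TYPE_A, QOS_TYPE_B };

/* Shared anchor: one object per (QoS type, user-visible cookie value). */
struct qos_cookie {
	enum qos_type type;
	uint32_t value;        /* value supplied by userspace */
	unsigned int refcnt;   /* number of tasks attached */
};

struct task {
	struct qos_cookie *cookie;   /* NULL when the task is untagged */
};

static struct qos_cookie *qos_cookie_alloc(enum qos_type type, uint32_t value)
{
	struct qos_cookie *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	c->type = type;
	c->value = value;
	return c;   /* refcnt is 0 until the first attach */
}

/* Link an untagged task to the shared cookie and take a reference. */
static void qos_cookie_attach(struct task *t, struct qos_cookie *c)
{
	assert(t->cookie == NULL);
	c->refcnt++;
	t->cookie = c;
}

/* Drop the task's reference; free the cookie when the last task leaves.
 * Returns the number of references remaining. */
static unsigned int qos_cookie_detach(struct task *t)
{
	struct qos_cookie *c = t->cookie;
	unsigned int left;

	if (!c)
		return 0;
	t->cookie = NULL;
	left = --c->refcnt;
	if (left == 0)
		free(c);
	return left;
}

/* Two tasks are grouped iff they point at the same cookie object. */
static int qos_same_group(const struct task *a, const struct task *b)
{
	return a->cookie != NULL && a->cookie == b->cookie;
}
```

With a pointer as the anchor, the group-membership check is a single pointer comparison, and the (type, value) tuple stays an internal detail that never has to appear in the user-facing interface.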
I think the cookie values should be userspace managed. From experience, this
has to be done in a centralized way via a service; otherwise you'd end up with
a mess. There has to be an all-knowing entity managing things, which is
what I am proposing with the schedqos service. That's why the whole QOS is now
protected with the CAP_NICE capability, a change from v1 that I forgot to
mention.
I am not sure why we do not leverage the OS to allocate and manage cookies.
The OS has full visibility of system-wide information and can maintain globally
unique cookies. Users only need to ask the OS to allocate a group, or to attach
tasks to or detach them from an existing group, without supplying an explicit
cookie value.
One possible reason I can think of: since the schedqos cookie is defined per QoS
type and per process, it may be more convenient to manage it entirely within the
schedqos service?
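For comparison, the OS-managed alternative could be sketched as below: the caller never picks a cookie value, it only asks for a new group and then attaches or detaches tasks. This is a userspace illustration under stated assumptions; qos_group_create(), qos_group_attach(), qos_group_detach() and the 64-bit ID are all hypothetical, not an existing kernel interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch of OS-allocated group IDs; not an existing interface. */
#define QOS_NO_GROUP 0u

struct task {
	uint64_t group_id;   /* QOS_NO_GROUP when untagged */
};

/* System-wide counter, so every allocated ID is globally unique. */
static uint64_t next_group_id = 1;

/* Allocate a fresh group; the caller does not choose the value. */
static uint64_t qos_group_create(void)
{
	assert(next_group_id != UINT64_MAX);   /* sketch: never wrap around */
	return next_group_id++;
}

/* Attach a task to an existing group; reject IDs that were never handed out. */
static int qos_group_attach(struct task *t, uint64_t id)
{
	if (id == QOS_NO_GROUP || id >= next_group_id)
		return -1;
	t->group_id = id;
	return 0;
}

static void qos_group_detach(struct task *t)
{
	t->group_id = QOS_NO_GROUP;
}
```

Because the ID comes from a single system-wide allocator, tasks from different processes can join the same group without any coordination over cookie values, which is the cross-process case (u3) raised earlier in the thread.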
thanks,
Chenyu