Re: [PATCH v2 RFC 08/13] sched/qos: Add a new sched-qos interface

From: zhidao su

Date: Tue Jun 23 2026 - 06:29:58 EST


On Mon, May 04, 2026 at 02:59:58AM +0100, Qais Yousef wrote:
> Provide a generic and extensible interface to describe arbitrary QoS
> tags to tell the kernel about specific behavior that doesn't fall
> into the existing sched_attr.
>
> The interface is broken into three parts:
>
> * Type
> * Value
> * Cookie

Hi Qais,

I tested the proposed ABI on an Android device running
6.18.20-android17-3-maybe-dirty-4k with this sched_qos series carried
locally. This was a direct syscall preflight rather than an app-level
benchmark:

- set SCHED_FLAG_QOS / SCHED_QOS_RAMPUP_MULTIPLIER=4 on an existing
  task with sched_setattr();
- read it back with sched_getattr() and check type/value;
- repeat with value 0;
- fail the test if sched_setattr() returns 0 but readback does not match.
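
For reference, the check logic is roughly the sketch below. It is not the
actual test: the syscalls are abstracted behind callbacks so the logic can
be exercised without the series applied, and the struct, the constant
values and the fake backends are placeholders made up for illustration,
not the definitions from the patch.

```c
#include <stdint.h>

/* Placeholder values; the real ones come from the series' uapi headers. */
#define QOS_FLAG_PLACEHOLDER	0x100u
#define QOS_RAMPUP_PLACEHOLDER	1u

/* Only the fields the check needs; NOT the real struct sched_attr layout. */
struct qos_attr {
	uint64_t flags;
	uint32_t qos_type;
	int64_t  qos_value;
};

/* Stand-ins for sched_setattr()/sched_getattr(); return 0 on success. */
typedef int (*qos_set_fn)(void *ctx, const struct qos_attr *attr);
typedef int (*qos_get_fn)(void *ctx, struct qos_attr *attr);

enum preflight_result {
	PREFLIGHT_OK = 0,
	PREFLIGHT_SET_FAILED,
	PREFLIGHT_GET_FAILED,
	PREFLIGHT_MISMATCH,	/* set returned 0 but readback differs */
};

/* Set one value, read it back, and compare type and value. */
static enum preflight_result
qos_preflight_one(void *ctx, qos_set_fn set, qos_get_fn get, int64_t value)
{
	struct qos_attr want = {
		.flags = QOS_FLAG_PLACEHOLDER,
		.qos_type = QOS_RAMPUP_PLACEHOLDER,
		.qos_value = value,
	};
	/* Prefill the type: the query reports the value for that type. */
	struct qos_attr got = { .qos_type = QOS_RAMPUP_PLACEHOLDER };

	if (set(ctx, &want))
		return PREFLIGHT_SET_FAILED;
	if (get(ctx, &got))
		return PREFLIGHT_GET_FAILED;
	if (got.qos_type != want.qos_type || got.qos_value != want.qos_value)
		return PREFLIGHT_MISMATCH;
	return PREFLIGHT_OK;
}

/* The sequence from the list above: value 4 first, then value 0. */
static enum preflight_result
qos_preflight(void *ctx, qos_set_fn set, qos_get_fn get)
{
	enum preflight_result r = qos_preflight_one(ctx, set, get, 4);

	if (r != PREFLIGHT_OK)
		return r;
	return qos_preflight_one(ctx, set, get, 0);
}

/* Hypothetical fake backend, only for exercising the check. */
struct fake_task {
	int64_t rampup;
	int drop_qos_only;	/* 1: report success without applying */
};

static int fake_set(void *ctx, const struct qos_attr *attr)
{
	struct fake_task *t = ctx;

	if (!t->drop_qos_only)
		t->rampup = attr->qos_value;
	return 0;	/* success either way */
}

static int fake_get(void *ctx, struct qos_attr *attr)
{
	struct fake_task *t = ctx;

	if (attr->qos_type == QOS_RAMPUP_PLACEHOLDER)
		attr->qos_value = t->rampup;
	return 0;
}
```

With a backend that returns 0 without applying the value, the check
reports PREFLIGHT_MISMATCH instead of passing silently.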

This exposed two ABI semantics that I think need to be nailed down.

First, a QoS-only sched_setattr() can return success without applying the
QoS state.

In __sched_setscheduler(), the policy-unchanged fast path checks whether
nice, RT priority, DL params, or util-clamp changed before deciding there
is nothing to do. It does not treat SCHED_FLAG_QOS as a change there, so a
call that only changes sched_qos_type/value can return 0 before reaching
__setscheduler_sched_qos().

That makes syscall success ambiguous for a userspace manager: it needs to
read the state back to know whether the QoS request actually took effect.
I think SCHED_FLAG_QOS should be sufficient to force the change path.
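
To make the suggestion concrete, here is a small userspace model of that
decision. It is not kernel code: the structs, flag values and the
qos_forces_change switch are made up for illustration, and only nice and
util-clamp are modelled. It only shows the shape of the check with and
without SCHED_FLAG_QOS forcing the change path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model-only flag values; not the real SCHED_FLAG_* constants. */
#define MODEL_FLAG_UTIL_CLAMP	0x1u
#define MODEL_FLAG_QOS		0x2u

struct model_task {
	uint32_t policy;
	int32_t  nice;
};

struct model_attr {
	uint32_t policy;
	int32_t  nice;
	uint64_t flags;
};

/*
 * Returns true when the request must take the full change path and
 * false when the "nothing to do" fast path would return 0 early.
 * qos_forces_change == false models the behavior I observed;
 * qos_forces_change == true models the suggested behavior.
 */
static bool model_needs_change(const struct model_task *p,
			       const struct model_attr *attr,
			       bool qos_forces_change)
{
	if (attr->policy != p->policy)
		return true;
	if (attr->nice != p->nice)
		return true;
	if (attr->flags & MODEL_FLAG_UTIL_CLAMP)
		return true;
	if (qos_forces_change && (attr->flags & MODEL_FLAG_QOS))
		return true;
	return false;
}
```

In this model a QoS-only request on an otherwise unchanged task takes the
early return unless the QoS flag is treated as a change.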

Second, sched_getattr() uses sched_qos_type from the userspace sched_attr
buffer as the selector for which QoS value to report:

	if (copy_from_user(&kattr.sched_qos_type,
			   &uattr->sched_qos_type,
			   sizeof(kattr.sched_qos_type)))
		return -EFAULT;

	switch (kattr.sched_qos_type) {
	case SCHED_QOS_RAMPUP_MULTIPLIER:
		kattr.sched_qos_value = p->sched_qos.rampup_multiplier;
		kattr.sched_qos_cookie = 0;
		break;
	default:
		break;
	}

If this input-selector behavior is intentional, I think it should be
documented as part of the query ABI. Otherwise sched_getattr() looks like
an output-only operation, and helper code can easily confuse "I did not
select a QoS type to query" with "this task has no QoS state".

The same preflight also checked that short sched_attr sizes with
SCHED_FLAG_QOS fail predictably. They do fail today, but only indirectly:
the short copy leaves sched_qos_type as SCHED_QOS_NONE. It may be
clearer to make the required size for SCHED_FLAG_QOS explicit, similar to
the util-clamp size check.
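
As a rough illustration of what I mean by an explicit check, here is a
userspace model. It is not a patch: the 48/56 sizes mirror mainline's
SCHED_ATTR_SIZE_VER0/VER1, but the QoS size and both flag values are
placeholders I picked for the sketch.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_ATTR_SIZE_VER0	48u	/* mirrors SCHED_ATTR_SIZE_VER0 */
#define MODEL_ATTR_SIZE_VER1	56u	/* mirrors SCHED_ATTR_SIZE_VER1 */
#define MODEL_ATTR_SIZE_QOS	80u	/* placeholder, not from the series */

/* Model-only flag values; not the real SCHED_FLAG_* constants. */
#define MODEL_FLAG_UTIL_CLAMP	0x1u
#define MODEL_FLAG_QOS		0x2u

/*
 * Reject a request whose flags ask for fields that the user-supplied
 * size cannot contain. Returns 0 if the size is sufficient, -EINVAL
 * otherwise.
 */
static int model_check_attr_size(uint64_t flags, uint32_t size)
{
	/* Existing style of check: util-clamp fields need the VER1 size. */
	if ((flags & MODEL_FLAG_UTIL_CLAMP) && size < MODEL_ATTR_SIZE_VER1)
		return -EINVAL;

	/* Suggested: same kind of explicit check for the QoS fields. */
	if ((flags & MODEL_FLAG_QOS) && size < MODEL_ATTR_SIZE_QOS)
		return -EINVAL;

	return 0;
}
```

The point is only that the failure would come from an explicit size test
rather than from sched_qos_type happening to stay at SCHED_QOS_NONE.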

With set/readback checks in place, I can observe a narrow scheduler-layer
signal in a controlled native transition workload. I would not treat that
as a generic Android UI latency result or a Tested-by tag, but the
preflight made the set/query semantics above stand out.

Thanks.