Re: [PATCH] sched/core: sched_getattr returning consistent sched_priority

From: Peter Zijlstra
Date: Tue Dec 19 2017 - 11:16:55 EST


On Tue, Dec 19, 2017 at 03:41:04PM +0000, Alessio Balsini wrote:
> Always initialise the sched_priority field of the sched_attr struct
> returned by sched_getattr().
> The sched_getattr() syscall takes care of returning a sched_attr
> structure updated with the current scheduling "attributes" of the
> requested thread. This syscall function is dual to sched_setattr() that,
> instead, assigns the given values to the specified thread.
>
> sched_setattr(), as described in the documentation, imposes that
> whenever a thread switches to any non-RT scheduling policy, rt_priority
> must be 0. This check is performed by __sched_setscheduler() which, in
> case of a negative result, ignores the request and returns -EINVAL (*).
> Thus, when calling sched_getattr(), the user would expect that non-RT
> threads will have sched_priority equal to 0 and RT threads 1 <=
> sched_priority <= 99.
> As a result, the sched_priority field should always be specified by
> sched_getattr() with the rt_priority of the thread, whose value is
> coherent thanks to (*).
>
> Since the RT policy check is dropped, the condition to update sched_nice
> is made explicit with the introduced task_has_fair_policy().
> The parameters associated with FAIR and DL tasks can be inconsistent for
> the non-corresponding scheduling classes, and this behaviour, left
> unchanged, is correct since it does not violate the documentation.
>
> Moreover, __getparam_dl(), the function that takes care of filling the
> the sched_attr parameters associated with DL tasks, updates also
> sched_priority. Here, the sched_priority field is out of scope and is
> removed.
> This inaccuracy was introduced in 06a76fe08d4d, that moved the function
> from core.c to deadline.c. Before that, it was making more sense to
> access sched_priority, either if the function name __getparam_dl() was
> misleading.

OK, I'm dense, what?

> sched_setattr(), as described in the documentation, imposes that
> whenever a thread switches to any non-RT scheduling policy, rt_priority
> must be 0.

That is about all I got.


> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 2473736..0ad9cfd 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2502,7 +2502,6 @@ void __getparam_dl(struct task_struct *p, struct sched_attr *attr)
> {
> struct sched_dl_entity *dl_se = &p->dl;
>
> - attr->sched_priority = p->rt_priority;
> attr->sched_runtime = dl_se->dl_runtime;
> attr->sched_deadline = dl_se->dl_deadline;
> attr->sched_period = dl_se->dl_period;

That seems sane.

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 75554f3..174d611 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4606,11 +4606,10 @@ SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
> attr.sched_policy = p->policy;
> if (p->sched_reset_on_fork)
> attr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
> + attr.sched_priority = p->rt_priority;
> if (task_has_dl_policy(p))
> __getparam_dl(p, &attr);
> - else if (task_has_rt_policy(p))
> - attr.sched_priority = p->rt_priority;
> - else
> + else if (task_has_fair_policy(p))
> attr.sched_nice = task_nice(p);
>
> rcu_read_unlock();

This is confusing, why unconditionally assign? We initialize to 0, and
if it must be 0 for !RT, then we should only assign when rt.