sched: Where to queue RT tasks on prio drop
From: Grochowalski, Matthew (GE Aviation, US)
Date: Wed May 04 2016 - 18:03:52 EST
It looks like commit 81a44c5 (sched: Queue RT tasks to head when prio drop) made the behavior on a priority drop (as seen from userspace) more sensible, but I believe the behavior is still incorrect according to POSIX.
POSIX (in Volume 2, Section 2.8.4, Process Scheduling) specifies two different semantics for where the task is placed in the thread list for the new priority:
8. If a thread whose policy or priority has been modified by pthread_setschedprio() is a running thread or is runnable, the effect on its position in the thread list depends on the direction of the modification, as follows:
a. If the priority is raised, the thread becomes the tail of the thread list.
b. If the priority is unchanged, the thread does not change position in the thread list.
c. If the priority is lowered, the thread becomes the head of the thread list.
7. If a thread whose policy or priority has been modified other than by pthread_setschedprio() is a running thread or is runnable, it then becomes the tail of the thread list for its new priority.
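To make the two rules concrete, here is a minimal sketch of the placement decision as I read the quoted text. It is only a model: posix_placement() and the enum are hypothetical names, not kernel or libc interfaces, and it assumes the userspace view where a larger number means a higher priority.

```c
#include <stdbool.h>

/* Where a running or runnable thread lands in the list for its new priority. */
enum placement { PLACE_HEAD, PLACE_TAIL, PLACE_UNCHANGED };

/*
 * Hypothetical helper, for illustration only.
 * via_setschedprio: true if the change was made by pthread_setschedprio(),
 * false for any other interface (e.g. sched_setscheduler, sched_setparam).
 * Assumes a larger value means a higher priority (userspace view).
 */
static enum placement posix_placement(int old_prio, int new_prio,
                                      bool via_setschedprio)
{
    if (!via_setschedprio)
        return PLACE_TAIL;       /* rule 7: always the tail */
    if (new_prio > old_prio)
        return PLACE_TAIL;       /* rule 8a: priority raised */
    if (new_prio < old_prio)
        return PLACE_HEAD;       /* rule 8c: priority lowered */
    return PLACE_UNCHANGED;      /* rule 8b: priority unchanged */
}
```

Under this reading, the head placement on a priority drop applies only to pthread_setschedprio(); every other interface gets the tail regardless of direction.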
Commit 81a44c5 made all of the priority change functions behave according to the pthread_setschedprio semantics.
It appears commit ff77e46 (sched/rt: Fix PI handling vs. sched_setscheduler()) causes changing a task's priority to its existing priority to requeue it at the tail.
So a task setting its own priority to its current priority would have the same effect as a sched_yield().
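The case I mean looks roughly like the sketch below. reapply_own_priority() is a hypothetical helper name; sched_getparam() and sched_setparam() are the standard calls. I have not measured the resulting queue position with this exact code, so treat it as an illustration of the call sequence rather than a test case.

```c
#include <sched.h>

/*
 * Re-apply the calling thread's current priority without changing it.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int reapply_own_priority(void)
{
    struct sched_param sp;

    if (sched_getparam(0, &sp) != 0)   /* pid 0 means the caller */
        return -1;
    return sched_setparam(0, &sp);     /* same priority, policy untouched */
}
```

For a SCHED_FIFO task with other runnable tasks at the same priority, this call would let those tasks run first if the task is requeued at the tail, which is what sched_yield() does.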
I believe the correct behavior is to have the existing priority change syscalls (sched_setscheduler and sched_setparam) always move the changed task to the back of the queue for the new priority.
As far as I can tell, though, the kernel provides no way to implement pthread_setschedprio() with the correct semantics.
It seems the best way to implement this would be adding a flag (SCHED_SETSCHEDPRIO) to the existing sched_setattr syscall.
Any thoughts?
Thanks,
--Matt Grochowalski