Linux Kernel Scheduling Addition Notification: Hybrid Sleepers and Unfair Scheduling

From: Mitchell Erblich
Date: Wed Mar 18 2015 - 19:25:23 EST


Please note that this proposal is from this engineer and not from the company he works for.

This SHOULD also fulfill any legal notification of work done, but not submitted to the Linux kernel.


Transfer of Information: Notification & Proposal of Feasibility to Support System V Release 4 De Facto Standard Scheduling Extensions within Linux
-------------------------------


SCHED_IA
Over 10 years ago, System V Release 4 was enhanced with additional features by Sun Microsystems. One of the more minor extensions subdivided a process's scheduling characteristics and was known as the INTERACTIVE (IA) scheduling class. This scheduling class was targeted at frequent sleepers, with the mouse icon being one of the first such processes/tasks.

Linux has no explicit SCHED_IA scheduling policy, but it does alter scheduling characteristics based on observed sleep behavior (for example, GENTLE_FAIR_SLEEPERS) within the fair-share scheduling configuration options. A CPU-bound process/task that also fits a SLEEPER profile can show hybrid behavior over time: during any one scheduling period it may consume its entire variable-length time allocation, which lengthens the short latency it would otherwise expect before becoming current/ONPROC. To simplify the implementation, it is suggested that SCHED_IA be a sub scheduling policy of SCHED_NORMAL. Shouldn't an administrator be able to declare that the NORMAL long-term behavior of a task is that of a FIXED sleeper?
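To make the sleeper heuristic concrete, below is a toy Python sketch of the wakeup-placement logic that CFS's place_entity() applies, where GENTLE_FAIR_SLEEPERS halves the sleeper credit; the function name, the simplified units, and the default latency value are illustrative assumptions, not kernel code:

```python
def place_on_wakeup(task_vruntime, cfs_min_vruntime,
                    sched_latency_ns=24_000_000, gentle=True):
    """Toy model of CFS sleeper placement (illustrative only).

    A waking task is placed slightly behind the runqueue's min_vruntime
    so it runs soon; GENTLE_FAIR_SLEEPERS halves that credit so habitual
    sleepers cannot accumulate an unbounded advantage.
    """
    thresh = sched_latency_ns // 2 if gentle else sched_latency_ns
    candidate = cfs_min_vruntime - thresh
    # A task's vruntime is never moved backwards by placement.
    return max(task_vruntime, candidate)
```

A hypothetical SCHED_IA sub-policy could pin this credit for tasks an administrator has classified as FIXED sleepers, instead of deriving it from recent (possibly hybrid, CPU-bound) behavior.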

Thus, the first proposal is to explicitly support a SCHED_IA scheduling policy within the Linux kernel. Once kernel support exists, any application with functionality equivalent to priocntl(1) would then need to be altered to support this new scheduling policy.
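The user-visible part of such a priocntl(1)-like utility is mostly a mapping from class names to policy constants. A minimal Python sketch follows; the SCHED_IA value is a hypothetical placeholder (no such constant exists in Linux today), while the other values match the current Linux <sched.h> definitions:

```python
# Real Linux policy constants (per <linux/sched.h>):
SCHED_OTHER = 0   # i.e. SCHED_NORMAL, the TS/time-sharing class
SCHED_FIFO = 1
SCHED_RR = 2
# Hypothetical constant for the proposed interactive sub-policy;
# the value 101 is an arbitrary illustration, not kernel-assigned.
SCHED_IA = 101

POLICY_BY_CLASS = {
    "TS": SCHED_OTHER,
    "FIFO": SCHED_FIFO,
    "RR": SCHED_RR,
    "IA": SCHED_IA,  # proposed sub-policy of SCHED_NORMAL
}

def class_to_policy(name):
    """Translate a priocntl-style class name to a policy constant."""
    try:
        return POLICY_BY_CLASS[name.upper()]
    except KeyError:
        raise ValueError(f"unknown scheduling class: {name}")
```

A real implementation would pass the resulting constant to sched_setscheduler(2) after the CAPABILITY check described below.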


Note: Administrator in this context means a task with a UID, EUID, GID, EGID, etc., that has the proper CAPABILITY to alter scheduling behavior.


SCHED_UNFAIR
UNIX / Linux scheduling for the most part attempts to achieve some level of process / task fairness, which the Linux scheduler provides through the fair-share scheduling class. Exceptions do exist, but are not discussed below. In general this type of scheduling is acceptable in a generic implementation, but it has weaknesses when UNIX / Linux is moved into a different environment. Many companies use UNIX / Linux in heavy networking environments where one or more tasks can infrequently need to consume more than their fair share within a scheduling window.

This proposal acknowledges that "nice" and a few other scheduling workarounds do not always suffice to allow this temporary inequality to exist. Yes, a cpumask could be set so that only certain tasks run on specific nodes, but the implied assumption is that such tasks only infrequently need inequality, i.e. a greater "time slice" than their fair share. A network-protocol example is a convergence task that needs to run until it is finished; until that happens, needed routing changes will not occur. Such a task SHOULD not need to dominate consecutive scheduling windows, so over longer periods of time it still fulfills the fair scheduling policy. Again, the proper CAPABILITY needs to be required by the priocntl(1)-like application, since running many such tasks per CPU COULD affect the performance of the system.
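The long-run-fairness idea above can be sketched as simple windowed accounting: an UNFAIR task may overrun its share in one window, accrues a debt, and repays it in later windows. This is a toy Python model under assumptions of my own choosing (per-window fixed fair share, full debt carry-over), not a proposed kernel implementation:

```python
class WindowedFairShare:
    """Toy accounting for a hypothetical SCHED_UNFAIR policy.

    A task may exceed its fair share within one scheduling window;
    the overrun becomes debt that shrinks its allowance in subsequent
    windows, so cumulative usage converges back to the fair share.
    """

    def __init__(self, fair_share_ns):
        self.fair_share_ns = fair_share_ns
        self.debt_ns = 0  # outstanding overrun to repay

    def allowance(self):
        # Next window's allocation = fair share minus outstanding debt.
        return max(0, self.fair_share_ns - self.debt_ns)

    def charge(self, used_ns):
        # Usage beyond one fair share adds debt; using less repays it.
        self.debt_ns = max(0, self.debt_ns + used_ns - self.fair_share_ns)
```

For example, a convergence task burning 2.5 windows' worth of CPU in one burst would then be held to a reduced allowance until its cumulative usage matches the fair policy again.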

Thus, explicit support for a new SCHED_UNFAIR scheduling policy is proposed within the Linux kernel. Again, it could be a sub scheduling policy of SCHED_NORMAL.

If there is an expressed need / want for this type of functionality to be submitted as a patch against a git tree, please inform this engineer of any additional information to be provided; this minimal TOI document is architecture / enhancement oriented and does not go into the details of a possible implementation of the above functionality.

Thank you,
Mitchell Erblich
UNIX Kernel Engineer