Re: [PATCH] [request for inclusion] Realtime LSM

From: Con Kolivas
Date: Thu Jan 13 2005 - 21:48:32 EST


utz lehmann wrote:
On Fri, 2005-01-14 at 13:08 +1100, Con Kolivas wrote:

utz lehmann wrote:
Just an idea. What about throttling runaway RT tasks?
If the system spend more than 98% in RT tasks for 5s consider this as a
_fatal error_. Print an error message and throttle RT tasks by inserting
ticks where only SCHED_OTHER tasks allowed. For a limit of 98% this
means one SCHED_OTHER only tick all 50 ticks.

The limit and timeout should be configurable and of course it can be
disabled.

I know this is against RT task preempt all SCHED_OTHER but this is only
for a fatal system state to be able to recover sanely. A locked up
machine is is the worse alternative.

There is a patch in -mm currently designed to use a sysrq key combination which converts all real time tasks to sched normal to save you if you desire in a lockup situation. We do want to preserve RT scheduling behaviour at all times without caveats for privileged users.


The sysrq is already in 2.6.10. I had to use it the last days a few
times. But it does help if you have no access to the console.

The RT throttling idea is not to change the behavior in normal
conditions. It's only for a fatal system state. If you have a runaway RT
task you can't guarantee the system is work properly anyway. It's
blocking vital kernel threads, filesystems, swap, keyboard, ...

I understand fully your concern. If such a thing were to be introduced it would have to be disabled by default. Since I'm looking at implementing such throttling for user RT tasks, it should be trivial to add it to other RT tasks, and have 100% as the default cpu limit. How does that sound?

It's a bit like out of memory. You can do nothing and panic. Or trying
something bad (killing processes) which is hopefully better as the
former.
btw: Are RT tasks excluded by the oom killer?

I haven't looked. VM hackers?

Con

Attachment: signature.asc
Description: OpenPGP digital signature