Re: scheduler nice 19 versus 'idle' behavior / static low-priority scheduling
From: Lennart Sorensen
Date: Mon Feb 02 2009 - 12:24:02 EST
On Fri, Jan 30, 2009 at 12:49:44AM -0500, Nathanael Hoyle wrote:
> All (though perhaps of special interest to a few such as Ingo, Peter,
> and David),
>
> I am posting regarding an issue I have been dealing with recently,
> though this post is not really a request for troubleshooting. Instead
> I'd like to ramble for just a moment about my understanding of the
> current 2.6 scheduler, describe the behavior I'm seeing, and discuss a
> couple of the architectural solutions I've considered, as well as pose
> the question of whether anyone else views this as a general-case
> problem worthy of being addressed, or whether it is something that by
> and large gets ignored. I hope this is not too off-topic for this
> group.
>
> First, let me explain the issue I encountered. I am running a relatively
> powerful system for a home desktop, an Intel Core 2 Quad Q9450 with 4 GB
> of RAM. If it matters for the discussion, it also has 4 drives in an
> mdraid raid-5 array, and decent I/O throughput. In normal circumstances
> it is quite responsive as a desktop (KDE 3.5.4 atm). It also runs a
> very carefully configured kernel build, including only those things
> which I truly need and excluding everything else. I often use it to
> watch DVD movies, and have had no trouble with performance in general.
>
> Recently I installed the Folding@Home client, which many of you may be
> familiar with, intended to utilize spare CPU cycles to perform protein
> folding simulations in order to further medical research. It is not a
> multi-threaded client at this point, so it simply runs four instances on
> my system, since it has four cores. It is configured to run at
> nice-level 19.
I too have seen this behaviour on my quad-core Q6600 MythTV box; I too
run folding@home on it, and it too has a 4-drive RAID-5 array.
> Because it is heavily optimized, and needs little external data to
> perform its work, it spends almost all of its time cpu-bound, with
> little to no io-wait or blocking on network calls, etc. I had been
> using it for about a week with no real difficulty until I went to watch
> another DVD and found that the video was slightly stuttery/jerky so long
> as foldingathome was running in the background. Once I shut it down,
> the video playback resumed its normal smooth form.
>
> There are a couple of simple workarounds for this:
>
> Substantially boosting the priority of the mplayer process also
> returns the video to smooth playback, but this is undesirable in that
> it requires manual intervention each time, as well as root privileges.
> It fails to achieve what I want, which is for the foldingathome
> computation to not interfere with anything else I may try to do. I
> want my compiles to be as close to *exactly* as fast as they were
> without it as possible, etc.
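For reference, that manual workaround boils down to something like the
sketch below (the pid is hypothetical, and raising priority with a
negative nice value needs root or CAP_SYS_NICE):

    /* renice_player.c: boost an already-running player's priority.
     * Minimal sketch; the pid below is hypothetical. */
    #include <sys/types.h>
    #include <sys/resource.h>
    #include <stdio.h>

    int main(void)
    {
        pid_t player_pid = 12345;   /* hypothetical mplayer pid */

        /* A negative nice value raises priority; this call
         * requires root (or CAP_SYS_NICE). */
        if (setpriority(PRIO_PROCESS, player_pid, -10) == -1) {
            perror("setpriority");
            return 1;
        }
        return 0;
    }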
>
> Stopping foldingathome before I do something performance sensitive is
> also possible, but again smacks of workaround rather than solution. The
> scheduler should be able to resolve the goal without me stopping the
> other work.
>
> I have done a bit of research on how the kernel scheduler works, and why
> I am seeing this behavior. I had previously, apparently ignorantly,
> assumed 'nice 19' was akin to Microsoft Windows' 'idle' thread
> priority, and that it would never steal CPU cycles from a process
> with a higher (lower, depending on nomenclature) priority.
>
> It is my current understanding that when mplayer is running (also
> typically CPU bound, though occasionally it becomes briefly I/O bound),
> the instance of foldingathome which is sharing the CPU (core) with
> mplayer starts getting starved, and the scheduler dynamically rewards
> it with up to four additional priority levels based on how much of its
> quantum it was not allowed to execute for.
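For reference, the dynamic part of that works roughly as in the sketch
below: the O(1) scheduler tracks a per-task sleep_avg and turns it into
a bonus of up to MAX_BONUS/2 = 5 levels in either direction. This is a
simplified userspace model of the logic in kernel/sched.c, not the
kernel code itself; the sleep_avg bookkeeping is collapsed into a
single ratio:

    /* model_prio.c: simplified userspace model of the 2.6 O(1)
     * scheduler's dynamic priority bonus (paraphrased from
     * kernel/sched.c; the constants are real, the sleep_avg
     * accounting is reduced to one number). */
    #include <stdio.h>

    #define MAX_RT_PRIO  100   /* 0..99 are realtime priorities */
    #define MAX_PRIO     140   /* 100..139 map to nice -20..19 */
    #define MAX_BONUS     10   /* bonus spans +/- MAX_BONUS/2 levels */

    static int effective_prio(int static_prio, double sleep_ratio)
    {
        /* sleep_ratio in [0,1]: stand-in for the kernel's sleep_avg,
         * i.e. how much of its recent life the task spent off the
         * CPU. */
        int bonus = (int)(sleep_ratio * MAX_BONUS) - MAX_BONUS / 2;
        int prio  = static_prio - bonus;

        if (prio < MAX_RT_PRIO)  prio = MAX_RT_PRIO;
        if (prio > MAX_PRIO - 1) prio = MAX_PRIO - 1;
        return prio;
    }

    int main(void)
    {
        /* nice 19 == static priority 139 */
        printf("nice 19, pure cpu hog : %d\n", effective_prio(139, 0.0));
        printf("nice 19, kept waiting : %d\n", effective_prio(139, 1.0));
        printf("nice 0,  kept waiting : %d\n", effective_prio(120, 1.0));
        return 0;
    }

So a nice-19 task that has spent time off the CPU can climb from 139
toward 134, which lets it compete for the CPU the moment mplayer
blocks.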
>
> At this point, when mplayer blocks for just a moment, say to page in the
> data for the next video frame, foldingathome gets scheduled again, and
> gets to run for at least MIN_TIMESLICE (plus, due to the lack of kernel
> pre-emptibility, possibly longer). It appears that it takes too long
> to switch back to mplayer, and the result is the stuttering picture I
> observe.
>
> I have tried adjusting CONFIG_HZ_xxx from 300 (where I had it) to 1000,
> and noted some improvement, but not a complete remedy.
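(For anyone wanting to try the same: that is the timer-frequency choice
under "Processor type and features", and the relevant .config lines end
up looking something like

    # CONFIG_HZ_100 is not set
    # CONFIG_HZ_250 is not set
    # CONFIG_HZ_300 is not set
    CONFIG_HZ_1000=y
    CONFIG_HZ=1000

A faster tick shortens the intervals at which the scheduler re-evaluates
things, but as noted above it does not remove the priority boost
itself.)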
>
> In my prior searching on this, I found only one poster with the same
> essential problem (from 2004, regarding distributed.net running in the
> background). The only technical answer given to him was to perhaps try
> tuning the MIN_TIMESLICE value downward. It is my understanding that
> this parameter is relatively important for avoiding cache thrashing,
> and I do not wish to alter it, and have not so far.
>
> Given all of the above, I am unconvinced that I see a good overall
> solution. However, one thing that seems to me a glaring weakness of the
> scheduler is that only realtime threads can be given static
> priorities. What I really want for foldingathome, and similar tasks, is
> a static, low priority: one that would never be boosted, no matter how
> well behaved the task was, how long it had been starved, or how
> cache-hot its working set happened to be.
>
> I think that there are probably (at least) three approaches here. One I
> consider unacceptable at the outset is to alter the semantics of
> nice 19 such that it never boosts. Since this would break existing
> assumptions and code, I do not think it is feasible.
>
> Secondly, one could add additional nice levels which would correspond to
> new static priorities below the bottom of the current user ones. This
> should not interfere with the O(1) scheduler implementation as I
> understand it, because currently, I believe, 5 32-bit words are used to
> flag the queue usage, and 140 priorities leave 20 more bits available
> for new priorities. This has its own problems, however, in that
> existing tools which examine process priorities could break on
> priorities outside the known 'nice' range of -20 to 19.
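Roughly, the lookup the O(1) scheduler does against that bitmap looks
like the sketch below (a simplified userspace model; the kernel's
sched_find_first_bit() is an optimized equivalent). Extra priority
levels would indeed just consume some of the 20 spare bits:

    /* bitmap_lookup.c: simplified model of the O(1) runqueue's
     * priority bitmap.  140 levels fit in five 32-bit words
     * (160 bits), leaving 20 spare bits that new low priorities
     * could use. */
    #include <stdint.h>
    #include <stdio.h>

    #define MAX_PRIO     140
    #define BITMAP_WORDS ((MAX_PRIO + 31) / 32)   /* = 5 */

    /* Return the lowest-numbered (highest-priority) non-empty
     * queue. */
    static int find_first_set(const uint32_t bitmap[BITMAP_WORDS])
    {
        for (int w = 0; w < BITMAP_WORDS; w++)
            if (bitmap[w])
                return w * 32 + __builtin_ctz(bitmap[w]);
        return MAX_PRIO;  /* nothing runnable */
    }

    int main(void)
    {
        uint32_t bitmap[BITMAP_WORDS] = { 0 };

        bitmap[120 / 32] |= 1u << (120 % 32);  /* nice-0 task queued */
        bitmap[139 / 32] |= 1u << (139 % 32);  /* nice-19 task queued */

        /* The nice-0 queue (priority 120) wins. */
        printf("next queue to run: %d\n", find_first_set(bitmap));
        return 0;
    }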
>
> Finally, new scheduling classes could be introduced, together with new
> system calls so that applications could select a different scheduling
> class at startup. In this way, applications could volunteer to use a
> scheduling class which never received dynamic 'reward' boosts that would
> raise their priorities. I believe Solaris has done this since Solaris
> 9, with the 'FX' scheduling class.
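For what it's worth, the opt-in side needs no new system call:
sched_setscheduler() already selects the policy, and kernels since
2.6.23 expose a SCHED_IDLE policy that is at least in the spirit of
what you describe. A minimal sketch of an application volunteering at
startup (whether SCHED_IDLE's exact semantics fix the boost problem
here is a separate question):

    /* idle_class.c: a process volunteering for an idle-style
     * scheduling class at startup.  SCHED_IDLE (merged in 2.6.23)
     * stands in for the kind of never-boosted static class being
     * proposed. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        struct sched_param param = { .sched_priority = 0 };

        /* pid 0 means the calling process; no privileges are
         * needed to lower yourself into SCHED_IDLE. */
        if (sched_setscheduler(0, SCHED_IDLE, &param) == -1) {
            perror("sched_setscheduler");
            return 1;
        }

        /* ... the folding work loop would run here ... */
        return 0;
    }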
>
> Stepping back:
>
> 1) Is my problem 'expected' based on others' understanding of the
> current design of the scheduler, or do I have a one-off problem to
> troubleshoot here?
>
> 2) Am I overlooking obvious alternative (but clean) fixes?
>
> 3) Does anyone else see the need for static, but low process priorities?
>
> 4) What is the view of introducing a new scheduler class to handle this?
>
> I welcome any further feedback on this. I will try to follow replies
> on-list, but would appreciate being CC'd off-list as well. Please make
> the obvious substitution to my email address in order to bypass the
> spam-killer.
Well, I haven't looked into it myself, but I can certainly confirm that
the current behaviour is downright awful with this particular mix of
processes.
--
Len Sorensen