Re: scheduling

Alexander Kjeldaas (astor@guardian.no)
Fri, 1 May 1998 20:33:30 +0200


On Fri, May 01, 1998 at 07:05:04PM +0200, MOLNAR Ingo wrote:
>
> (*): isn't exactly true, we have (rare) 'recalculation' events which are
> O(nr_tasks). They have a probability of 1/(20*nr_runnable), i.e. decreasing
> with more runnable processes and being very small even with 1 runnable
> process. Unfortunately, there is one exception in 2.0.33 (recent 2.1
> kernels are fixed), sched_yield(), it caused counter-recalculation for
> every schedule(). This showed up in certain (valid, really scheduling)
> benchmarks. Again, this was fixed recently in 2.1.
>

This is exactly what I noticed while doing the pidhash patch. The
scheduler is the only place in the kernel, except for three cases I
think (killall, exit while being ptraced, and soon 'remove
capabilities on all processes/set securelevel') that will traverse all
processes. It is thus the only place where we depend on the number of
processes. However, I was not aware of any recent sched_yield() fixes
[it seems it was not fixed in 2.1.90], nor that the recalculations
become less frequent as the number of processes grows. With 2000
processes, that makes for less than one recalculation per 6
minutes. Isn't that _way_ too seldom? I have a feeling that the recent
sched_yield() "fix" might have some bad side-effects on large systems.
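
The arithmetic behind the 6-minute figure can be sketched as below. The
1/(20*nr_runnable) probability is the number quoted above; the rate of
100 schedule() calls per second used in the note after the code is an
assumption (roughly one call per timer tick at HZ=100), and the function
name is made up for illustration.

```c
#include <assert.h>

/*
 * Expected number of seconds between two counter recalculations.
 *
 * Assumes each schedule() call triggers a recalculation with probability
 * 1/(20*nr_runnable), as quoted above, so on average one recalculation
 * happens every 20*nr_runnable calls. schedules_per_sec is an assumed
 * rate, not a measured one.
 */
double expected_recalc_interval_sec(unsigned int nr_runnable,
                                    double schedules_per_sec)
{
    assert(nr_runnable > 0 && schedules_per_sec > 0.0);
    return 20.0 * nr_runnable / schedules_per_sec;
}
```

With 2000 runnable processes and 100 calls per second this gives 400
seconds, i.e. a bit more than 6 minutes between recalculations. A
workload that calls schedule() more often shortens the interval
proportionally.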

If you look at ftp://ftp.guardian.no/pub/free/linux/pidhash.gif , I
have a green line (2nd from the top) that shows the slowest time for a
fork+exit+waitpid. It looks very linear, with only a few exceptions
[the three spikes could be some cache anomaly??], and to me it seemed
logical that these samples were plain bad luck: the scheduler happened
to do a recalculation during them. If that happens only once every 6
minutes at 2000 processes, I must have been wrong, or the system had
some process calling sched_yield() at the time, making the
recalculations far more common. Note that a recalculation takes 1ms at
2000 processes.
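
For reference, the recalculation is a single pass over every task. The
user-space sketch below is modelled on the 2.0-era loop in schedule()
(counter = counter/2 + priority) and should be checked against the
actual source; struct task and the function name are simplified
stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the two task_struct fields involved. */
struct task {
    long counter;   /* remaining time slice, in ticks */
    long priority;  /* static priority, in ticks */
};

/*
 * Recompute the time slice of every task, runnable or not.
 * The cost is one update per task, hence O(nr_tasks).
 * Returns the number of tasks visited.
 */
size_t recalculate_counters(struct task *tasks, size_t nr_tasks)
{
    size_t i;

    assert(tasks != NULL || nr_tasks == 0);
    for (i = 0; i < nr_tasks; i++)
        tasks[i].counter = (tasks[i].counter >> 1) + tasks[i].priority;
    return i;
}
```

At the measured 1ms for 2000 processes, this works out to roughly 500ns
per task visited.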

I agree that real-life efficiency of the scheduler is most important, but
nevertheless, it would be nice to remove the last "unnecessary"
O(nr_tasks) from the kernel. [I _am_ taking for granted that someone
will figure out how to do recalculations in less than O(nr_tasks) time
;-)].
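
One possible direction, sketched here purely as an illustration and not
taken from any kernel or patch, is to make the recalculation event
itself O(1) by bumping a global epoch and letting each task catch up
lazily when its counter is next read. All names below (struct
lazy_task, recalc_epoch, and both functions) are hypothetical, and the
sketch assumes counters and priorities are non-negative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-task state: the usual two fields plus the epoch at
 * which counter was last brought up to date. */
struct lazy_task {
    long counter;
    long priority;
    unsigned long epoch;
};

/* Global epoch, bumped once per recalculation event. */
unsigned long recalc_epoch;

/* O(1) replacement for the full walk: only record that an event happened. */
void lazy_recalculate_all(void)
{
    recalc_epoch++;
}

/*
 * Bring one task up to date and return its counter.
 * Applies the update once per missed event, but stops early once the
 * counter reaches a fixed point of c -> (c >> 1) + priority, because
 * further applications would leave it unchanged.
 */
long lazy_task_counter(struct lazy_task *p)
{
    assert(p != NULL);
    while (p->epoch != recalc_epoch) {
        long next = (p->counter >> 1) + p->priority;

        if (next == p->counter) {
            /* Fixed point: skip the remaining missed events at once. */
            p->epoch = recalc_epoch;
            break;
        }
        p->counter = next;
        p->epoch++;
    }
    return p->counter;
}
```

The catch is that every reader of a counter has to go through the
catch-up helper first, so the work is deferred to the tasks that are
actually looked at rather than removed altogether.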

astor

-- 
 Alexander Kjeldaas, Guardian Networks AS, Trondheim, Norway
 http://www.guardian.no/
