Ah the irony... My patch for 2.1.131 changed it to 15, but Linus chose
10 :-)) Lucky guess!
I must read the new scheduling code and try to figure out the effects of
smp_reschedule(). I wasn't aware that it removed the interactive
problem. Also note that I changed it from *20* to 15, so don't be
completely sure you've killed the interactive problem unless it's gone
when you build a kernel with it set to 20. Basically, due to the nature
of the problem, even reducing it to 19 was enough to prevent it hurting
interactive response time, so a smaller value may simply be masking the
problem rather than curing it.
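For anyone following along, this is roughly where the number bites in the
2.2 scheduler - a from-memory sketch of goodness(), simplified (no RT or
SCHED_YIELD handling), so check kernel/sched.c for the real thing:

static inline int goodness(struct task_struct *p, int this_cpu,
			   struct mm_struct *this_mm)
{
	int weight = p->counter;	/* remaining timeslice, in ticks */

	if (!weight)
		return 0;
#ifdef __SMP__
	/* Bonus for staying on the CPU the task last ran on; this is
	 * the 10 vs 15 vs 20 knob we're arguing about. */
	if (p->processor == this_cpu)
		weight += PROC_CHANGE_PENALTY;
#endif
	/* Small bonus for sharing the current mm (cheap switch). */
	if (p->mm == this_mm)
		weight += 1;

	return weight + p->priority;
}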
>
> > swaps CPU *every* timeslice on an x86, big deal - how long can it
> > possibly take on say a PII-512kB cache to refill the L2 cache
> > completely? A damn small percentage of 200ms I'll bet. [...]
>
> takes about 5 msecs on a Xeon with 1M cache, 2-3 msecs on a PII with 512K
> cache.
>
> > My guess is more like 5ms and that's being really generous. Older CPU's
> > have smaller L2 caches and probably shared ones anyway so it won't be
> > very important for them either.
Sounds fine.
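To put numbers on it, here's my own back-of-the-envelope, assuming
~200MB/s of effective cache-fill bandwidth (pulling a cache in
line-by-line is a lot slower than streaming - the figure is my guess,
but it happens to line up with your 5 msec number):

#include <stdio.h>

/* Worst-case cost of refilling a cold L2 after a CPU switch, as a
 * fraction of a 200 msec timeslice.  The 200 MB/s effective fill rate
 * is an assumption; plug in your own. */
int main(void)
{
	const double fill_rate = 200e6;			/* bytes/sec, assumed */
	const double timeslice = 0.200;			/* 200 msec */
	const double caches[] = { 512e3, 1024e3 };	/* PII 512K, Xeon 1M */
	int i;

	for (i = 0; i < 2; i++) {
		double refill = caches[i] / fill_rate;
		printf("%4.0fK L2: refill ~%.1f msec, %.1f%% of a timeslice\n",
		       caches[i] / 1e3, refill * 1e3,
		       100.0 * refill / timeslice);
	}
	return 0;
}

i.e. a couple of percent per migration at most, and only if the cache
really is completely cold.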
>
> you are missing the point. threads switching between CPUs are quite
> common. If a thread is CPU-bound and it's the only thread then your
> example is the _best case_, not the worst case. The worst case is a
> webserver with spikes of load, having say 10-20 processes running at once,
> and heavy rescheduling between say 200 processes. Not accounting for cache
> affinity properly will kill performance in lots of CPU-bound cases.
For me this simply underlines the fact that the current affinity scheme
is badly broken.
We may also have different ideas about what CPU-bound means - 10-20
processes on the runq, sure, but heavy rescheduling between 200
processes sounds a little I/O-related to me.
> also, these 'only 5 msecs' are 2.5% of the CPU (we've done full kernel
> rewrites for substantially less performance ..), for _both_ CPUs in
I never realised we cared so much about 2% ;-)) Also, don't forget that
those 5ms (your 2.5%) are for a Xeon-1M; on a PII with its smaller cache
it's less than that. And don't forget that the 5ms is the *worst* case.
If you really expect to completely trash the L2 cache then maybe the
application in question should be run in round-robin mode (Rik's patch,
if it's not in circulation already).
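I don't have Rik's patch to hand, so purely as an illustration of the
sort of knob I mean (PF_NOAFFINITY is made up, not anything in the
tree): the goodness() sketch above could simply skip the affinity bonus
for tasks known to trash the cache anyway, e.g.

#ifdef __SMP__
	/* Hypothetical: let a cache-trashing task opt out of the
	 * affinity bonus, so it runs wherever a CPU is free. */
	if (p->processor == this_cpu && !(p->flags & PF_NOAFFINITY))
		weight += PROC_CHANGE_PENALTY;
#endif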
> question. Also, in those 5 msecs we occupy the memory bus, which is a
> scarce resource as well.
>
> > then suddenly you may have a much larger portion of your timeslice
> > wasted with cache misses... But upping PROC_CHANGE_PENALTY probably
>
> but the whole point is moot I think, because 2.2.1 _has_ a high
>
Well yes, it's high, but the original poster was comparing 2.0 with 2.2 -
2.2.1 has a LOWER PROC_CHANGE_PENALTY than 2.0, so it's not out of the
question that he's on the borderline with 2.0 and just over the edge
with 2.2.1.
cheers
Neil