Re: Interesting scheduling times - NOT

Jamie Lokier (lkd@tantalophile.demon.co.uk)
Fri, 25 Sep 1998 11:09:39 +0100

Next message: Jamie Lokier: "Re: patch: serial 16550C autoflow"
Previous message: Jan Kara: "Question about ext2"

On Fri, Sep 25, 1998 at 11:01:46AM +1000, Richard Gooch wrote:
> The application I monitored (up to 10 processes on the run queue) is
> part of the latest generation astronomical data reduction software
> currently being developed by an international consortium of radio
> astronomy institutions. This software is highly modular and has a
> central co-ordination process which communicates with "agents"
> (processes) to perform most of the work.
> IIRC over 50 man-years has already been put into this project. It is
> far too late to change something this fundamental to the design.
> BTW: it's not *my* design. I'm just telling you what is out there.

And indeed, there are other, similar applications in physics whose
developers are also questioning Linux's context switching times.

A project I work with has all but dismissed kernel context switching
and interrupt latency and overhead as far too slow. It is looking into
doing the lot in user space: user-space context switching, user-space
polling to handle some data-gathering devices, and a device on the PCI
bus that writes to user-space memory to help with this.

If the slowness is down to cache effects, that work will be pointless:
it'll just complicate the user-space code and there won't be any gain.

If it's really kernel problems, we need to know!

BTW, I typically get 5% variation in a simple, single CPU-bound test of
some splay tree code. The distribution is rather odd: a run tends to
take x seconds (mostly) or x*0.95 seconds (sometimes), with very few
(but some) in between. And obviously there are a few larger ones when
the system does something else in the middle of the test. The 5% split
must surely be due to cache effects, and L2 at that.
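
For anyone who wants to look for the same kind of two-humped timing
distribution, here's a rough sketch of a repeat-and-histogram harness.
It is in Python rather than the original test, and cpu_bound_work() is
a made-up stand-in for the splay tree code, so treat the names and the
numbers it prints as assumptions, not as my actual benchmark.

```python
import statistics
import time


def cpu_bound_work(n=200_000):
    """Stand-in workload (hypothetical): pure CPU arithmetic, no I/O."""
    total = 0
    for i in range(n):
        total += (i * i) % 7
    return total


def time_runs(work, repeats=30):
    """Return the wall-clock duration in seconds of each call to work()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        work()
        samples.append(time.perf_counter() - start)
    return samples


def histogram(samples, buckets=10):
    """Count samples in equal-width buckets between min and max."""
    lo, hi = min(samples), max(samples)
    # Fall back to a width of 1.0 when every sample is identical.
    width = (hi - lo) / buckets or 1.0
    counts = [0] * buckets
    for s in samples:
        idx = min(int((s - lo) / width), buckets - 1)
        counts[idx] += 1
    return counts


if __name__ == "__main__":
    samples = time_runs(cpu_bound_work)
    print("min %.6f  median %.6f  max %.6f" % (
        min(samples), statistics.median(samples), max(samples)))
    for count in histogram(samples):
        print("#" * count)
```

If the runs really cluster at x and x*0.95, the histogram should show
two clumps with little in between, plus the odd outlier at the slow end
from unrelated system activity.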

Richard, I just had a thought. Maybe some of your variance is due to
the state of pseudo-LRU in the L1 cache? Intel once told me that, in
general, there's no way an application can force a known configuration
within a cache set, except by flushing it. This is because their
variation on pseudo-LRU won't guarantee to push out more than 3 ways
within a set (on MMX, where there are 4 ways).
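
To make the pseudo-LRU point a bit more concrete, here's a minimal
sketch of the textbook tree pseudo-LRU scheme for one 4-way set. This
is an assumption for illustration only: it is the generic 3-bit scheme,
not Intel's variation, and the class and function names are made up.
What it shows is that the line evicted on a miss depends on state bits
left behind by earlier accesses, and need not be the true
least-recently-used line.

```python
class TreePLRUSet:
    """One 4-way cache set with textbook tree pseudo-LRU (3 state bits).

    Assumption: this is the generic scheme, not Intel's actual variant.
    """

    def __init__(self, bits=(0, 0, 0)):
        # bits[0] picks the half (0 = ways 0-1, 1 = ways 2-3);
        # bits[1] picks within ways 0-1, bits[2] within ways 2-3.
        self.bits = list(bits)
        self.tags = [None] * 4

    def victim(self):
        """Way the state bits currently point at for replacement."""
        if self.bits[0] == 0:
            return 0 + self.bits[1]
        return 2 + self.bits[2]

    def touch(self, way):
        """Mark a way as recently used: point each bit on its path away."""
        if way < 2:
            self.bits[0] = 1
            self.bits[1] = 1 - way
        else:
            self.bits[0] = 0
            self.bits[2] = 1 - (way - 2)

    def access(self, tag):
        """Look up tag; on a miss, replace the victim. Returns evicted tag."""
        if tag in self.tags:
            self.touch(self.tags.index(tag))
            return None
        way = self.victim()
        evicted = self.tags[way]
        self.tags[way] = tag
        self.touch(way)
        return evicted


def evicted_after(history, new_tag):
    """Replay a sequence of tags on a fresh set, then miss on new_tag."""
    cache_set = TreePLRUSet()
    for tag in history:
        cache_set.access(tag)
    return cache_set.access(new_tag)


if __name__ == "__main__":
    print(evicted_after("ABCD", "E"))   # no hits before the miss
    print(evicted_after("ABCDC", "E"))  # one extra hit on C beforehand
```

In the second call, true LRU would throw out A, but the tree bits point
at B instead. Note that in this textbook version four consecutive misses
to a set do replace all four ways, so it doesn't reproduce the behaviour
Intel described to me; it only illustrates how hidden replacement state
can make otherwise identical runs behave differently.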

> > Come on, Richard. Do you want there to be no standards for kernel
> > hackers? What do you suggest we do when people show up with no
> > experience and want to check in their favorite thing to the kernel?
> > I'm sorry, but the answer isn't "that's nice, have fun". If you
> > can't stand the heat...

Larry, just have more faith in Linus. If Richard's code is crap, it'll
be rejected. If it makes the scheduler simpler by grouping the RT
special cases together, and fixes some bugs, and Richard's happy with
it, and Linus is happy with it, where's the harm? Even if Richard's
variances do turn out to be an artefact.

-- Jamie

