Re: Scheduling Times --- Revisited

Richard Gooch (rgooch@atnf.csiro.au)
Tue, 29 Sep 1998 16:16:59 +1000


Larry McVoy writes:
> I said:
> : > I'd like to start out by saying I feel crappy about my part in this
> : > whole discussion. I'm good at proving I have more to learn. I
> : > don't want to discourage people like Richard - that isn't the right
> : > answer. The right answer is to help him do better work and
> : > encourage better thinking, in a supportive way, and I haven't done
> : > that. So I'm sorry for screwing that up, that's my problem.
>
> Richard Gooch <rgooch@atnf.csiro.au> said:
> : Well, thanks. This sounds more positive. However, I still think I
> : should point out that a phrase like "encourage better thinking" is
> : still not that tactful.
>
> FLAME ON:
>
> Oh, excuse me, Mr Gooch sir. I forgot that your brain is perfect and
> that your code is perfect and that you have absolutely no room for
> improvement whatsoever. How horribly rude of me to ever think that you
> are less than perfect and that anything I could ever say or do could help
> you in any way. Please accept my humble apologies, Mr Gooch sir, and rest
> assured that I'll try and be more careful in the future.
>
> FLAME OFF.
>
> Sheesh.

Well, you've gone off on a tangent here. It's not about my being
perfect (I'm not, and I don't claim to be). I'm just trying to make
you aware that your comment wasn't tactful. It sounds condescending.
I'm just trying to help you by pointing this out.

I'll try to illustrate by rephrasing your sentence in a way that I
think gets the positive side of your message across without having the
potential to offend or antagonise:

"The right answer is to calmly point out areas which I disagree with
and why, and I haven't done that."

Do you see the difference now? Also, bear in mind that because you
have already been abusive and arrogant, you need to work that little
bit harder to show that you really have changed your tune. You may
lament this, but it's a fact of life when interacting with
people. It's harder to wash away bad memories than good ones.

> : A few points here. Firstly, you said a number of times that "real" or
> : "correct" applications don't have a large run queue.
>
> And you've said yourself that you agreed with that, at least for RT
> processes.

That's right, although I find that in other posts my agreement on
this point keeps being forgotten, so I have to keep restating it.

> : Secondly, you have asserted that it won't be a problem for "realistic"
> : applications which only have a few processes on the run queue. My
> : tests on a Pentium 100 show that a mere 3 extra processes on the run
> : queue doubles the scheduling/wakeup latency of an RT process.
>
> Whoop dee do. I can also show that 6 cache misses of any sort will do
> the same thing. So what?

So it's cumulative. Also, as I'll explain in a later email, we have an
RT application that runs on 386s (no, we can't afford to upgrade
them, and besides, a 386 and pSOS+ is fast enough anyway). So variance
due to cache misses is not always a concern.

> : application, but some of our RT applications have threads which run
> : for a very short time (read from blocking device, compute a new value
> : and write, place the recently read value into SHM, unlock a semaphore
> : and read again (block)).
>
> This is mistake #1 made by inexperienced coders in this area. It's
> funny, Mr Gooch, that I've encountered this sort of problem before,
> including in systems that had orders of magnitude more than your
> claimed 50 man years invested in the system, and yet somehow these
> systems managed to get rid of processes that woke up, read one
> thing, wrote one thing, released a lock, and went back to sleep.
>
> Wanna know why they fixed it? Because it is a brain dead way of
> gathering data, and that was immediately obvious to all but the most
> inexperienced programmers. When I pointed out that that was their
> problem they were horribly embarrassed and went and fixed it. None
> of them (and I can think of at least 4 different occasions where
> customers were doing this, that doesn't count the internal cases)
> sat around and tried to tell me that the OS needed to fix this
> problem. You are unique in that respect and I'm not sure that's the
> sort of uniqueness one should foster.

So what do you suggest as an alternative? Scenario: 10 kHz interrupt
from device, new value needs to be written within 50 us. Device driver
reads data, wakes up RT process. RT process reads data from driver,
computes and writes new value, writes recent value to SHM and blocks
on read again.
SHM value is read at a much lower rate by low-priority threads. Some
of these can safely lag behind, so they don't even need RT priority.
This runs on a 386DX33, where a context switch takes 12.3 us (with no
extra processes on the run queue) and each extra process on the run
queue adds another 7.4 us. Add interrupt latency, the work to be done,
syscall overheads and interrupt-disabling sections, and we're getting
close to 50 us. Add a few monitor threads (SCHED_OTHER), and we go
past that.

Regards,

Richard....

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/