Re: priority inversion

yodaiken@chelm.cs.nmt.edu
Fri, 30 Jul 1999 09:01:17 -0600


On Fri, Jul 30, 1999 at 04:14:21PM +0200, Bernd Paysan wrote:
> > In the common case, Linux processes acquire kernel locks in 4 or 5
> > instructions, not O(1). What is the overriding benefit we obtain by
> > replacing that with such a mess?
>
> The common case (lock is free and can be obtained quickly) is not even in
> discussion here.

No? But you want to eliminate all spin_locks. Even if we ignore that:

P0 obtains a lock A in 5 instructions and sleeps.
P1 obtains a lock B in 5 instructions and blocks on A
P2 blocks on B

If P2 is RT then P0 must be promoted. For P2 to be able to promote P0 there must be some
way of finding P0 and making sure P0 does not wake up and release the lock before it is
promoted. We also have to make sure P0 unpromotes itsself on lock release.
Here's one error case that those 5 instructions must prevent
P2 puts itself on the queue for B
P2 looks at P1 state, promotes P1, and determines P1 is waiting on A
P2 looks at A and determines A is locked by P0
P0 restarts on a second processor and releases A
P2 promotes the priority of P0, needlessly.
P1 is running on a second processor and unlocks B
P2 sleeps on B. Now P0 and P1 have the wrong priority indefinitely, P2 has missed a wakeup ...

Oh oh. We need a bunch of locks here to keep our locking algorithm from screwing up.

Or consider a simpler case:
P0 gets lock A and sets A->owner=P0
P1 starts to lock A on the second processor
P1 sees A is locked and reads A->owner == P0
P0 releases A
P2 locks A and sets A->owner=P2
P0 promotes P0 incorrectly
So in those 5 instructions, we need a reader lock on the owner field ...

We haven't yet touched on the topic of releases. Instead of clear_bit, we now
need to figure out whether we were promoted while holding the lock, and what our new
priority should be.

I can't figure out how to do all that in 5 instructions.

>It's the case that the lock is already obtained by another,
> lower priority process that isn't scheduled again (to release the lock)
> because of another higher priority process that is runnable. This is the
> "priority inversion" we are talking about.

I may be wrong, but I have heard of the problem. Thanks.

> Just looking on the common place will give you a formula one car, even
> when you have to go off-road once in a while. The fun here is that the common
> case is not affected at all.

See above.

>It's the case where schedule() is to be called
> that is affected, and actually, an O(1) scheduler improves that one, too.

O(1) is a dangerous measure. 2^{1000} is O(1). There is a reason why UNIXs have
traditionally not followed the N priority queues model of process scheduling. And it
is not because priority queues are esoteric knowledge.

> Priority inversion only gets messy when you use the current scheduler
> approach (compute goodness of all runnable processes, and select best).

If you can't read postscript, I can send you PDF.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/