Re: Race conditions galore (2.0.33 and possibly 2.1.x)

Bill Hawes (whawes@star.net)
Mon, 22 Dec 1997 12:20:14 -0500


Stephen R. van den Berg wrote:
> This I'm not certain about. I only performed a cursory check of schedule.c,
> but it seems that when the task is still on the run-queue (which it is,
> before it hits schedule()), the task state will *not* be set back to
> TASK_RUNNING.

The task state is set back to TASK_RUNNING by wake_up(), if it's
called from an interrupt.
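
To make the ordering concrete, here is a small user-space model of that
window. It is only a sketch, not kernel code: struct task, model_wake_up(),
model_schedule_would_sleep() and wakeup_in_window_is_lost() are hypothetical
names, and the model assumes that a wakeup marks a task on the wait queue
runnable whether or not it is still on the run queue, and that schedule()
only puts a task to sleep when its state is not TASK_RUNNING.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are not the kernel's definitions. */
enum task_state { TASK_RUNNING, TASK_UNINTERRUPTIBLE };

struct task {
    enum task_state state;
    bool on_wait_queue;
};

/* Model of a wakeup: a task found on the wait queue is marked runnable. */
static void model_wake_up(struct task *t)
{
    assert(t != NULL);
    if (t->on_wait_queue)
        t->state = TASK_RUNNING;
}

/* Model of schedule(): the task only sleeps if it is not runnable. */
static bool model_schedule_would_sleep(const struct task *t)
{
    return t->state != TASK_RUNNING;
}

/*
 * The task is already on the wait queue, marks itself sleeping, and the
 * wakeup arrives from a simulated interrupt before schedule() is reached.
 * Returns true if the task would go to sleep anyway (a lost wakeup).
 */
bool wakeup_in_window_is_lost(void)
{
    struct task t = { TASK_RUNNING, true };

    t.state = TASK_UNINTERRUPTIBLE;  /* about to wait */
    model_wake_up(&t);               /* interrupt fires in the window */
    return model_schedule_would_sleep(&t);
}
```

Under these assumptions the wakeup is not lost: the state is already
TASK_RUNNING again by the time the model reaches schedule().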

> Also, what happens if the same buffer is:
> 1. Locked.
> 2. We add ourselves to the wait queue.
> 3. The buffer is unlocked.
> 4. We are set to TASK_RUNNING.
> 5. Someone else locks the buffer.
> 6. We set ourselves to TASK_UNINTERRUPTIBLE.
> 7. We check the lock.
> 8. We drop dead in schedule().

Buffers are locked from a running task, never from an interrupt, so
there's no opportunity for another task to lock it while we're running.
But in any event, if a buffer is locked, it has to unlock at some point,
and when it unlocks, all of the tasks on the wait queue will be set to
TASK_RUNNING. So if we're on the wait queue and the buffer is currently
locked, it's safe to call schedule().
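
The argument above can be sketched as a user-space simulation of the wait
loop. This is an illustration under stated assumptions, not the code in
buffer.c: struct buffer, model_unlock_buffer(), model_schedule() and
model_wait_on_buffer() are hypothetical names, the wait queue is reduced to
a single waiter, and I/O completion is simulated by a tick counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are not the kernel's definitions. */
enum task_state { TASK_RUNNING, TASK_UNINTERRUPTIBLE };

struct task {
    enum task_state state;
};

struct buffer {
    bool locked;
    struct task *waiter;     /* one-entry stand-in for the wait queue */
    int ticks_until_unlock;  /* simulated I/O completion delay */
};

/* Unlocking always wakes whoever is on the wait queue. */
static void model_unlock_buffer(struct buffer *b)
{
    b->locked = false;
    if (b->waiter != NULL)
        b->waiter->state = TASK_RUNNING;
}

/*
 * Model of schedule(): sleep until something marks us runnable.
 * The precondition is the one stated in the text: we are on the wait
 * queue and the buffer is locked, so an unlock must eventually wake us.
 * Returns the number of simulated ticks spent asleep.
 */
static int model_schedule(struct task *self, struct buffer *b)
{
    int slept = 0;

    assert(b->waiter == self && b->locked);
    while (self->state != TASK_RUNNING) {
        slept++;
        if (--b->ticks_until_unlock <= 0)
            model_unlock_buffer(b);
    }
    return slept;
}

/* Wait loop: get on the queue first, then mark sleeping, then check. */
int model_wait_on_buffer(struct task *self, struct buffer *b)
{
    int total = 0;

    b->waiter = self;
    for (;;) {
        self->state = TASK_UNINTERRUPTIBLE;
        if (!b->locked)
            break;
        total += model_schedule(self, b);
    }
    b->waiter = NULL;
    self->state = TASK_RUNNING;
    return total;
}
```

In this model, if someone re-locks the buffer after we were woken, the loop
simply goes around again: we are still on the wait queue, so the next unlock
wakes us as well.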

> Note: I wouldn't be beating around about this so much, if it weren't
> for the fact that I have a machine here that actually *repeatedly*
> had processes hanging in constructs like this. I patched it as described,
> and the problems have not recurred yet (maybe that's just a coincidence,
> or maybe I even did something I shouldn't have been doing in the first
> place).

If you are seeing processes hang, perhaps you could add some printks
to isolate the conditions leading to it. There may very well be bugs
somewhere, and if a buffer is failing to unlock, the cause needs to be
tracked down.

Regards,
Bill