Re: [patch] Real-Time Preemption, deactivate() scheduling issue

From: George Anzinger
Date: Fri Mar 04 2005 - 10:46:52 EST


Eugeny S. Mints wrote:
Esben Nielsen wrote:

As I read the code the driver task (A) should _not_ be removed from the
runqueue. It has to be waken up to call schedule_timeout() such it gets
back on the runqueue after 10 ms. If it is taken out of the runqueue at
line 76 it will stay off the runqueue forever in the TASK_UNINTERRUBTIBLE
state!

Exactly. This is definilty the bug in the driver code - a developer just
didn;t care about proper utilization of set_current_state(). The driver works
just because as you have described - his fortune
that scheduler doesn't remove task in not TASK_RUNNING state from a run queue.
And my main question was - does everybody think it's ok have task in not TASK_RUNNING state in run queue. My current feeling is that this should not be allowed.

This is the normal and specified way to handle this sort of thing. There is a race issue that coding in this way avoids. The coding sequence is:
a) set the task state to some state other than TASK_RUNNING.
b) do what ever triggers the wake up. This may be several things, for example, an interrupt from some device OR a timeout.
c) call schedule to wait.

The race is getting to the schedule call before the wake up happens. If, for some reason, the wake up condition happens prior to the schedule call, it will set the task state back to TASK_RUNNING so that when the schedule() call is made the scheduler will just return which is the right thing (tm) to do as the condition being waited on has happened. We also note that disabling interrupts or preemption will NOT avoid the race unless you disable interrupts on ALL cpus, which is a VERY expensive cross cpu call.

As I read the use PREEMPT_ACTIVE, it is there to test on whether this
rescheduling is voluntary or forced (a preemption). If it is forced the
task shall of course not go off the runqueue but stay there to run again
when it gets the highest priority. That is why PREEMPT_ACTIVE is set in
preempt_schedule() and preempt_schedule_irq(). On the other hand if the
task itself has called schedule() or schedule_timeout() it has to go out
of the runqueue and wait for some event to wake it up.

You right - it works perfectly - but not for my test case - I believe task in not TASK_RUNNING state should be removed from a run queue by the first (any - voluntary or forced) execution of the schedule() which detects the task state is not TASK_RUNNIG.

This would cause the task to loose control prior to its setting up the needed wakeup events.


Yes there will be tasks in state other that TASK_RUNNING on the runqueue.
The "bug" as I see it is in the scheduler interface: There is no way to
set the task state and call schedule() or schedule_timeout() atomicly.
Therefore you can be preempted while the state is not TASK_RUNNING.

Exactly. IMO this interface is weird and needs rework. I don;t understand what the reason to set task state before schedule_timeout() call but not inside, right before the schedule(). The actual task state may be passed as a parameter.

You are assuming that the task ONLY wants to do a timeout. Most of the time the timeout indicates an error condition. The timeout bounds the wait for what is really desired, i.e. a device interrupt, some other task signaling, or some such.

Surly this is covered in the various driver writing guides...
--
George Anzinger george@xxxxxxxxxx
High-res-timers: http://sourceforge.net/projects/high-res-timers/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/