Re: -rt more realtime scheduling issues

From: Steven Rostedt
Date: Wed Oct 10 2007 - 07:51:18 EST


On Tue, Oct 09, 2007 at 11:49:53AM -0700, Mike Kravetz wrote:
> The more I try to understand the IPI handling the more confused I get. :(
> At first I was concerned about an IPI happening in the middle of the
> __schedule routine. But, then it occurred to me that interrupts are
> disabled when in this routine (when holding the runqueue lock). So, IPIs
> are not delivered during __schedule processing. Right?
>
> But, if this is the case then I don't understand the following code in
> schedule():
>
>         local_irq_disable();
>
>         do {
>                 __schedule();
>         } while (unlikely(test_thread_flag(TIF_NEED_RESCHED) ||
>                           test_thread_flag(TIF_NEED_RESCHED_DELAYED)));
>
>         local_irq_enable();
>
> How can the reschedule flags possibly be set AFTER running __schedule,
> especially when the call is explicitly surrounded by local_irq_disable/
> local_irq_enable?
>
> Can someone help me?
>

Sure, another CPU can set the task's NEED_RESCHED flag. In try_to_wake_up,
if the process that is waking up is on a runqueue on another CPU and it
is of higher priority than the task currently running there, the process
that is doing the waking will set the NEED_RESCHED flag for that running
task.
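
A rough userspace sketch of that cross-CPU path, for illustration only. The
names model_task, model_rq and model_remote_wakeup are made up here and are
not kernel APIs, and the sketch assumes that a higher number means a higher
priority. The real path also takes the runqueue lock and typically sends a
reschedule IPI, both of which are left out. The point it shows is that the
flag is a plain memory write by the waking CPU, so disabling interrupts on
the target CPU does not stop it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not kernel structures. */
struct model_task {
	int prio;                 /* assumption: higher number = higher priority */
	atomic_bool need_resched; /* stands in for TIF_NEED_RESCHED */
};

struct model_rq {
	struct model_task *curr;  /* task currently running on the remote CPU */
};

/*
 * Runs on the waking CPU. If the woken task outranks the task currently
 * running on the remote runqueue, mark that running task for rescheduling.
 * Returns true if the flag was set.
 */
bool model_remote_wakeup(struct model_rq *rq, const struct model_task *woken)
{
	assert(rq != NULL && rq->curr != NULL && woken != NULL);

	if (woken->prio > rq->curr->prio) {
		/* A store from another CPU: local IRQ state on the target is irrelevant. */
		atomic_store(&rq->curr->need_resched, true);
		return true;
	}
	return false;
}
```

In this model, waking a prio 2 task onto a runqueue whose current task has
prio 1 sets the flag, while waking a lower-priority task leaves it clear.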

So this covers a race: we have called schedule and, by the time we get to
the new running task, a higher priority process has just been woken up.
We will catch that here.

Now, is this really needed? I don't know. It seems that the check is just
there so we don't need to take the interrupt and then schedule on the way
back out of the interrupt handler as a preemption schedule. This way we
just schedule again and save a little overhead from doing that through
the interrupt.

But this brings up an interesting point. Since the IRQ handlers are run
as threads, and the interrupt is what will wake them, this seems to add
a bit of latency to interrupts.

For example:

We schedule in process A of prio 1

Before exiting __schedule, process B is woken up on that same rq
with a prio of 2, and the waker sets A's NEED_RESCHED flag.

Also an interrupt goes off and is sent to this CPU. But since interrupts
are disabled, we wait.

Leaving __schedule() we see that A's NEED_RESCHED flag is set, so we
continue the do while loop and call __schedule again.

We schedule in B of prio 2.

Leave __schedule as well as the do while loop and then enable
interrupts.

The interrupt that was pending is now triggered.

It wakes up the handler of prio 90 and, since that is higher in priority
than process B of prio 2, it sets B's NEED_RESCHED flag.

On return from the interrupt we call schedule again.
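
The sequence above can be replayed in a small userspace model. This is only
a sketch under stated assumptions: every name in it (sim_state, sim_result,
schedule_pass, irqs_enabled, simulate) is hypothetical, priorities follow
the example (a higher number wins), and both the wakeup of B and the
interrupt are assumed to arrive during the first pass through __schedule.
It counts how many passes complete before the pending interrupt is
delivered, with and without the do while loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; none of these names are kernel APIs. */
enum { TASK_A, TASK_B, TASK_IRQ_HANDLER, NR_TASKS };

struct sim_state {
	int prio[NR_TASKS];      /* higher number = higher priority */
	bool runnable[NR_TASKS];
	bool need_resched;       /* stands in for TIF_NEED_RESCHED */
	bool irq_pending;        /* interrupt raised while IRQs are off */
	int current;             /* running task, -1 before the first pick */
	int passes;              /* __schedule() passes completed so far */
};

struct sim_result {
	int irq_delivered_after_pass; /* passes completed when the IRQ fires */
	int handler_picked_on_pass;   /* pass that selects the prio 90 thread */
};

/* Pick the runnable task with the highest priority. */
static int pick_next(const struct sim_state *s)
{
	int best = -1;

	for (int i = 0; i < NR_TASKS; i++)
		if (s->runnable[i] && (best < 0 || s->prio[i] > s->prio[best]))
			best = i;
	return best;
}

/* One pass through the modeled __schedule(), interrupts disabled. */
static void schedule_pass(struct sim_state *s)
{
	s->need_resched = false;
	s->current = pick_next(s);
	s->passes++;

	if (s->passes == 1) {
		/* During the first pass: B is woken and an interrupt arrives. */
		s->runnable[TASK_B] = true;
		s->need_resched = true; /* B (prio 2) outranks A (prio 1) */
		s->irq_pending = true;  /* held off while interrupts are disabled */
	}
}

/* Models local_irq_enable(): a pending interrupt is delivered now. */
static void irqs_enabled(struct sim_state *s, struct sim_result *r)
{
	assert(s->current >= 0);

	if (!s->irq_pending)
		return;
	s->irq_pending = false;
	r->irq_delivered_after_pass = s->passes;

	/* The interrupt wakes its handler thread. */
	s->runnable[TASK_IRQ_HANDLER] = true;
	if (s->prio[TASK_IRQ_HANDLER] > s->prio[s->current])
		s->need_resched = true;
}

struct sim_result simulate(bool use_loop)
{
	struct sim_state s = {
		.prio = { 1, 2, 90 },
		.runnable = { true, false, false },
		.current = -1,
	};
	struct sim_result r = { 0, 0 };

	do {
		/* One schedule() call, interrupts disabled throughout. */
		do {
			schedule_pass(&s);
			if (s.current == TASK_IRQ_HANDLER)
				r.handler_picked_on_pass = s.passes;
		} while (use_loop && s.need_resched);

		irqs_enabled(&s, &r);
	} while (s.need_resched); /* a set flag leads to another schedule() call */

	return r;
}
```

Under these assumptions, simulate(true) delivers the interrupt after two
passes and picks the handler on the third, while simulate(false) delivers
it after one pass and picks the handler on the second. The model says
nothing about how often this interleaving occurs on real hardware.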

This seems strange. I can imagine that on a box with a large number of
CPUs this can happen quite often, leaving interrupts disabled for several
rounds through schedule.

I say we ax that while loop.

Ingo?

-- Steve
