Re: [PATCH v6 2/3] sched/rt: Fix wrong SMP scheduler behavior for equal prio cases

From: Peter Zijlstra
Date: Tue Apr 28 2015 - 06:19:32 EST

On Sun, Apr 26, 2015 at 11:58:51AM -0400, Steven Rostedt wrote:
> I think what Xunlei is trying to say, is that we don't currently keep
> FIFO when preemption or migration is involved. If a task is currently
> running, strict FIFO denotes that it should run ahead of all other
> tasks queued at its priority or less until it decides to schedule out.
> But the issue is, if it gets preempted or migrates, it gets placed
> behind other tasks of the same priority as itself, but it never
> voluntarily relinquished the CPU.

So 1) FIFO is only defined for UP, anything SMP is well outside of the
FIFO spec and therefore we cannot break it.

2) The 'head' of the queue only has meaning on UP; with SMP there are 'n'
heads, so which of those heads is the foremost head? That is, we've
already lost the order and you cannot reconstruct it. This cannot be done
without first defining an order and then implementing that.
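
To make point 2 concrete, here is a toy sketch in plain userspace C (not
kernel code; every name in it, such as toy_rq and toy_distribute, is
hypothetical). It assumes two per-CPU FIFO queues and shows that two
different global arrival orders leave exactly the same per-CPU queue
contents, so the queues alone cannot say which head is foremost.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: the sizes and names are assumptions for illustration. */
#define TOY_NR_CPUS   2
#define TOY_MAX_TASKS 4

struct toy_arrival {
	char name;	/* one-letter task name */
	int cpu;	/* CPU the task is queued on */
};

struct toy_rq {
	char tasks[TOY_MAX_TASKS + 1];	/* FIFO contents, NUL terminated */
	int len;
};

/* Append each arriving task to the tail of its CPU's queue. */
static void toy_distribute(const struct toy_arrival *arr, int n,
			   struct toy_rq rqs[TOY_NR_CPUS])
{
	memset(rqs, 0, TOY_NR_CPUS * sizeof(rqs[0]));
	for (int i = 0; i < n; i++) {
		assert(arr[i].cpu >= 0 && arr[i].cpu < TOY_NR_CPUS);
		struct toy_rq *rq = &rqs[arr[i].cpu];
		assert(rq->len < TOY_MAX_TASKS);
		rq->tasks[rq->len++] = arr[i].name;
	}
}

/*
 * Returns 1 if two different global arrival orders produce identical
 * per-CPU queues, i.e. the global order is not recoverable from them.
 */
static int toy_same_queues(void)
{
	const struct toy_arrival order1[] = {
		{ 'A', 0 }, { 'B', 1 }, { 'C', 0 }, { 'D', 1 },
	};
	const struct toy_arrival order2[] = {
		{ 'B', 1 }, { 'A', 0 }, { 'D', 1 }, { 'C', 0 },
	};
	struct toy_rq q1[TOY_NR_CPUS], q2[TOY_NR_CPUS];

	toy_distribute(order1, 4, q1);
	toy_distribute(order2, 4, q2);

	return strcmp(q1[0].tasks, q2[0].tasks) == 0 &&
	       strcmp(q1[1].tasks, q2[1].tasks) == 0;
}
```

In the first order A arrived before B, in the second B arrived before A,
yet both runs end with A at the head of CPU 0 and B at the head of CPU 1.
Any rule for "the foremost head" needs extra information, such as a
defined global order, that the per-CPU queues do not carry.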

The Changelog is completely devoid of useful information.

> Thus, if it gets preempted by a higher priority task, it should at a
> minimum be placed ahead of all other tasks of its priority or less to
> run on the CPU again.

Which, with the current status is an impossibility with the exception of

> If it gets migrated to another CPU, it should at
> least be placed ahead of other tasks on that new CPU of the same
> priority.

Who says this task is further 'ahead' than the head on the new CPU?

> Although, for the migration case, I'm not sure why it would
> be migrated to a CPU where it couldn't run right away in the first
> place, as the push/pull logic only migrates RT tasks that can run on
> the new CPU. Unless, he's talking about a race where a new task just
> got scheduled before it made it to the CPU? But that's a separate issue.

See, even you don't really know wtf he's wanting to do.

> But at least for being preempted by a higher priority task, it should
> be placed back ahead of the currently running tasks, unless it did a
> yield or is RR and its time ran out.

AFAICT this is already so. For RT we do not dequeue running tasks (CFS
does), so pick_next_task/put_prev_task does not change the location of a
task on the queues.
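
A toy sketch of that behaviour, in plain userspace C rather than kernel
code: it assumes a single FIFO list for one priority level, and all names
(toy_queue, toy_pick_next, toy_put_prev, toy_yield) are hypothetical
stand-ins, not the real scheduler functions. The running task is never
taken off the list, so being preempted and picked again leaves it at the
head; only an explicit requeue, as on yield or an expired RR slice, moves
it to the tail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: one FIFO list for a single RT priority level. */
#define TOY_MAX_TASKS 8

struct toy_queue {
	const char *tasks[TOY_MAX_TASKS];
	int len;
};

/* A newly woken task goes to the tail of its priority list. */
static void toy_enqueue_tail(struct toy_queue *q, const char *name)
{
	assert(q->len < TOY_MAX_TASKS);
	q->tasks[q->len++] = name;
}

/* Picking looks at the head but does not remove it from the list. */
static const char *toy_pick_next(const struct toy_queue *q)
{
	return q->len ? q->tasks[0] : NULL;
}

/* Putting the previous task back does not touch its queue position. */
static void toy_put_prev(struct toy_queue *q, const char *name)
{
	(void)q;
	(void)name;
}

/* Explicit requeue: move the head to the tail (yield, RR slice expiry). */
static void toy_yield(struct toy_queue *q)
{
	if (q->len < 2)
		return;
	const char *head = q->tasks[0];
	memmove(&q->tasks[0], &q->tasks[1],
		(size_t)(q->len - 1) * sizeof(q->tasks[0]));
	q->tasks[q->len - 1] = head;
}

/* A runs, B wakes at the same priority, A is preempted and later resumes. */
static const char *toy_after_preemption(void)
{
	struct toy_queue q = { .len = 0 };

	toy_enqueue_tail(&q, "A");
	const char *running = toy_pick_next(&q);	/* A starts running */
	toy_enqueue_tail(&q, "B");			/* B queues behind A */
	toy_put_prev(&q, running);			/* higher prio task preempts A */
	return toy_pick_next(&q);			/* who runs at this prio next? */
}

/* Same setup, but A gives up the CPU voluntarily. */
static const char *toy_after_yield(void)
{
	struct toy_queue q = { .len = 0 };

	toy_enqueue_tail(&q, "A");
	toy_enqueue_tail(&q, "B");
	toy_yield(&q);					/* A moves behind B */
	return toy_pick_next(&q);
}
```

In this model toy_after_preemption() returns "A" and toy_after_yield()
returns "B", which is the per-CPU ordering described above.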

> I'm not sure why your solution with yield_task_rt() and task_tick_rt()
> doesn't work. Maybe Xunlei is looking too deep into the solution.
> Monday, I'll try to spend some time looking at the scheduler logic
> there.

No, have Xunlei go write a coherent problem statement, and for as long
as you don't understand it send him back to it.