Re: [tip:sched/core] sched/numa: Introduce migrate_swap()

From: Rik van Riel
Date: Thu Oct 10 2013 - 15:05:47 EST


On 10/10/2013 02:17 PM, Peter Zijlstra wrote:
On Wed, Oct 09, 2013 at 10:30:13AM -0700, tip-bot for Peter Zijlstra wrote:
sched/numa: Introduce migrate_swap()

Thanks to Rik for writing the Changelog!

---
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Subject: sched: Fix race in migrate_swap_stop

There is a subtle race in migrate_swap, when task P, on CPU A, decides to swap
places with task T, on CPU B.

Task P:
- call migrate_swap
Task T:
- go to sleep, removing itself from the runqueue
Task P:
- double lock the runqueues on CPU A & B
Task T:
- get woken up, place itself on the runqueue of CPU C
Task P:
- see that task T is on a runqueue, and pretend to remove it
  from the runqueue on CPU B

Now CPUs B & C both have corrupted scheduler data structures.
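
The interleaving above can be replayed deterministically in a small
userspace sketch. This is a toy model, not kernel code: struct rq,
struct task, enqueue(), dequeue(), buggy_move() and replay_race() are
hypothetical names made up for illustration, a runqueue is reduced to a
task counter, and only task T's half of the swap is modelled.

```c
#include <assert.h>
#include <stdbool.h>

enum { CPU_A, CPU_B, CPU_C, NR_CPUS };

/* Toy runqueue: only counts the tasks queued on it. */
struct rq {
    int nr_running;
};

struct task {
    int cpu;     /* CPU whose runqueue the task is (or was last) on */
    bool on_rq;  /* true while the task is queued */
};

static void enqueue(struct rq *rqs, struct task *t, int cpu)
{
    t->cpu = cpu;
    t->on_rq = true;
    rqs[cpu].nr_running++;
}

static void dequeue(struct rq *rqs, struct task *t)
{
    t->on_rq = false;
    rqs[t->cpu].nr_running--;
}

/*
 * Buggy half of the swap: it checks that the task is queued, but then
 * trusts the CPU number it saw earlier instead of where the task is now.
 */
static void buggy_move(struct rq *rqs, struct task *t, int stale_cpu, int new_cpu)
{
    if (!t->on_rq)
        return;
    rqs[stale_cpu].nr_running--;  /* "remove" T from a queue it is not on */
    t->cpu = new_cpu;
    rqs[new_cpu].nr_running++;
}

/* Replays the steps from the changelog in order, on a single thread. */
static void replay_race(struct rq rqs[NR_CPUS])
{
    struct task p = {0}, t = {0};
    int seen_cpu;

    for (int i = 0; i < NR_CPUS; i++)
        rqs[i].nr_running = 0;

    enqueue(rqs, &p, CPU_A);
    enqueue(rqs, &t, CPU_B);

    seen_cpu = t.cpu;           /* P calls migrate_swap, having seen T on B */
    dequeue(rqs, &t);           /* T goes to sleep */
    /* P locks the runqueues of A and B here; that does not cover C */
    enqueue(rqs, &t, CPU_C);    /* T is woken up and lands on C */
    buggy_move(rqs, &t, seen_cpu, CPU_A);
}
```

In this model, after replay_race() the counter for CPU B has been
decremented for a task that was not queued there, and the counter for
CPU C still accounts for a task that has been moved away, which
corresponds to the two corrupted runqueues described above.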

This patch fixes it, by holding the pi_lock for both of the tasks
involved in the migrate swap. This prevents task T from waking up,
and placing itself onto another runqueue, until after migrate_swap
has released all locks.

This means that, when migrate_swap checks, task T will be either
on the runqueue where it was originally seen, or not on any
runqueue at all. Migrate_swap deals correctly with both of those cases.
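
The locking order described above can be sketched in userspace with
pthread mutexes. This is only a model under stated assumptions, not the
actual patch: struct rq, struct task, double_lock(), wake_up_model() and
migrate_swap_model() are hypothetical names, a pthread mutex stands in
for each kernel spinlock, and a sleeping task simply has its cpu field
updated.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct rq {
    pthread_mutex_t lock;
    int nr_running;
};

struct task {
    pthread_mutex_t pi_lock;  /* stands in for the task's pi_lock */
    int cpu;
    bool on_rq;
};

static void rq_init(struct rq *rq)
{
    pthread_mutex_init(&rq->lock, NULL);
    rq->nr_running = 0;
}

static void task_init(struct task *t, int cpu)
{
    pthread_mutex_init(&t->pi_lock, NULL);
    t->cpu = cpu;
    t->on_rq = false;
}

/* Take two mutexes in address order so concurrent callers cannot deadlock. */
static void double_lock(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if (a == b) {
        pthread_mutex_lock(a);
        return;
    }
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void double_unlock(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    if (a != b)
        pthread_mutex_unlock(b);
}

/* Wakeup path: it must hold pi_lock before it may pick a runqueue. */
static void wake_up_model(struct rq *rqs, struct task *t, int cpu)
{
    pthread_mutex_lock(&t->pi_lock);
    if (!t->on_rq) {
        pthread_mutex_lock(&rqs[cpu].lock);
        t->cpu = cpu;
        t->on_rq = true;
        rqs[cpu].nr_running++;
        pthread_mutex_unlock(&rqs[cpu].lock);
    }
    pthread_mutex_unlock(&t->pi_lock);
}

/* Caller holds both pi_locks and both runqueue locks. */
static void move_locked(struct rq *rqs, struct task *t, int to)
{
    if (t->on_rq) {
        rqs[t->cpu].nr_running--;
        rqs[to].nr_running++;
    }
    t->cpu = to;  /* a sleeping task only has its CPU updated */
}

/* Returns true if the tasks were swapped, false if either had moved away. */
static bool migrate_swap_model(struct rq *rqs, struct task *p, int p_cpu,
                               struct task *t, int t_cpu)
{
    bool swapped = false;

    /* pi_locks first: no wakeup of p or t can proceed past this point */
    double_lock(&p->pi_lock, &t->pi_lock);
    double_lock(&rqs[p_cpu].lock, &rqs[t_cpu].lock);

    /* Each task is now either where it was seen or not queued at all. */
    if (p->cpu == p_cpu && t->cpu == t_cpu) {
        move_locked(rqs, p, t_cpu);
        move_locked(rqs, t, p_cpu);
        swapped = true;
    }

    double_unlock(&rqs[p_cpu].lock, &rqs[t_cpu].lock);
    double_unlock(&p->pi_lock, &t->pi_lock);
    return swapped;
}
```

Because wake_up_model() needs the same pi_lock that
migrate_swap_model() holds across the whole operation, a wakeup in this
model either completes before the swap starts, in which case the CPU
check sees where the task really is, or waits until the swap has dropped
all of its locks.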

Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Tested-by: Joe Mario <jmario@xxxxxxxxxx>

Reviewed-by: Rik van Riel <riel@xxxxxxxxxx>
