On Tue, Aug 25, 2015 at 07:47:44PM +0800, Wanpeng Li wrote:
> On 8/25/15 6:32 PM, Peter Zijlstra wrote:
> > So Possibly, Maybe (I'm still too wrecked to say for sure), something
> > like this would work:
> >
> >	WARN_ON(debug_locks && (lockdep_is_held(&p->pi_lock) ||
> >				(p->on_rq && lockdep_is_held(&rq->lock))));
> >
> > Instead of those two separate lockdep asserts.
> >
> > Please consider carefully.

So the normal rules for changing task_struct::cpus_allowed are holding
both pi_lock and rq->lock, such that holding either stabilizes the mask.
This is so that wakeup can happen without rq->lock and load-balance
without pi_lock.
From this we already get the relaxation that we can omit acquiring
rq->lock if the task is not on the rq, because in that case
load-balancing will not apply to it.
** these are the rules currently tested in do_set_cpus_allowed() **
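
As a rough userspace sketch of the rule just described (a toy model, not kernel code): `struct task_model`, `strict_write_ok()` and `model_set_cpus_allowed()` are made-up names, and the booleans merely stand in for what lockdep would report about the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are not kernel types. */
struct task_model {
	bool pi_lock_held;	/* caller holds p->pi_lock */
	bool rq_lock_held;	/* caller holds the rq->lock of the task's rq */
	bool on_rq;		/* task is queued on a runqueue */
	unsigned long cpus_allowed;
};

/*
 * Writer rule as described above: pi_lock is always required, and
 * rq->lock is required as well whenever the task is on a runqueue.
 */
static bool strict_write_ok(const struct task_model *p)
{
	if (!p->pi_lock_held)
		return false;
	return !p->on_rq || p->rq_lock_held;
}

/* Stand-in for a mask update guarded by the assertions discussed above. */
static void model_set_cpus_allowed(struct task_model *p, unsigned long mask)
{
	assert(strict_write_ok(p));
	p->cpus_allowed = mask;
}
```

With this model, a task that is not on a runqueue may be updated under pi_lock alone, while a queued task needs both locks.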
Now, since __set_cpus_allowed_ptr() uses task_rq_lock() which
unconditionally acquires both locks, we could get away with holding just
rq->lock when on_rq for modification because that'd still exclude
__set_cpus_allowed_ptr(), it would also work against
__kthread_bind_mask() because that assumes !on_rq.
That said, this is all somewhat fragile.
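
The relaxed rule can be modelled the same way. This is only a sketch that mirrors the shape of the condition in the WARN_ON() quoted earlier in the thread; `struct lock_view` and `relaxed_write_ok()` are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of what the caller holds; not a kernel type. */
struct lock_view {
	bool pi_lock_held;	/* caller holds p->pi_lock */
	bool rq_lock_held;	/* caller holds the rq->lock of the task's rq */
	bool on_rq;		/* task is queued on a runqueue */
};

/*
 * Relaxed writer rule: pi_lock alone is enough, and so is rq->lock
 * alone provided the task is on a runqueue.
 */
static bool relaxed_write_ok(const struct lock_view *v)
{
	return v->pi_lock_held || (v->on_rq && v->rq_lock_held);
}

/* Walk the cases that matter for the argument above. */
static void relaxed_rule_demo(void)
{
	struct lock_view queued_rq_only = { false, true, true };
	struct lock_view idle_rq_only = { false, true, false };
	struct lock_view pi_only = { true, false, false };

	assert(relaxed_write_ok(&queued_rq_only));	/* migrate-style path */
	assert(!relaxed_write_ok(&idle_rq_only));	/* rq->lock says nothing here */
	assert(relaxed_write_ok(&pi_only));		/* bind-style path, !on_rq */
}
```

The case that separates this from the stricter rule is a queued task with only rq->lock held: accepted here, rejected when pi_lock is mandatory.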

> Commit (5e16bbc2f: sched: Streamline the task migration locking a little)
> won't hold the pi_lock in migrate_tasks() path any more, actually pi_lock
> was still not held when calling select_fallback_rq() and it was held in
> __migrate_task() before the commit. Then commit (25834c73f93: sched: Fix a
> race between __kthread_bind() and sched_setaffinity()) added a
> lockdep_assert_held() in do_set_cpus_allowed(), so the bug is triggered. How
> about something like below:
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5186,6 +5186,15 @@ static void migrate_tasks(struct rq *dead_rq)
>  		BUG_ON(!next);
>  		next->sched_class->put_prev_task(rq, next);
> +		raw_spin_unlock(&rq->lock);
> +		raw_spin_lock(&next->pi_lock);
> +		raw_spin_lock(&rq->lock);
> +		if (!(task_rq(next) == rq && task_on_rq_queued(next))) {
> +			raw_spin_unlock(&rq->lock);
> +			raw_spin_unlock(&next->pi_lock);
> +			continue;
> +		}

Yeah, that's quite disgusting.. also you'll trip over the lockdep_pin if
you were to actually run this.
Now, I don't think dropping rq->lock is quite as disastrous as it
usually is because !cpu_active at this point, which means load-balance
will not interfere, but that too is somewhat fragile.
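
For reference, the drop-and-retake pattern from the quoted patch looks roughly like this in userspace. It is a sketch with pthread mutexes and invented types (`struct rq_model`, `struct task_model`, `relock_with_pi_lock()`); it does not model lockdep pinning, and unlike the patch it returns with both locks held and leaves the unlocking to the caller.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct rq_model {
	pthread_mutex_t lock;		/* plays the role of rq->lock */
};

struct task_model {
	pthread_mutex_t pi_lock;	/* plays the role of p->pi_lock */
	struct rq_model *rq;		/* runqueue the task is attached to */
	bool queued;			/* task is queued on that runqueue */
};

/*
 * Call with rq->lock held. Returns with both p->pi_lock and rq->lock
 * held, taken in pi_lock -> rq->lock order. The return value tells the
 * caller whether p is still queued on rq after the unlocked window.
 */
static bool relock_with_pi_lock(struct task_model *p, struct rq_model *rq)
{
	pthread_mutex_unlock(&rq->lock);	/* p may move during this window */
	pthread_mutex_lock(&p->pi_lock);
	pthread_mutex_lock(&rq->lock);

	return p->rq == rq && p->queued;
}

/* Single-threaded walk through both outcomes. */
static void relock_demo(void)
{
	struct rq_model rq = { .lock = PTHREAD_MUTEX_INITIALIZER };
	struct rq_model other = { .lock = PTHREAD_MUTEX_INITIALIZER };
	struct task_model p = {
		.pi_lock = PTHREAD_MUTEX_INITIALIZER,
		.rq = &rq,
		.queued = true,
	};

	pthread_mutex_lock(&rq.lock);
	assert(relock_with_pi_lock(&p, &rq));	/* nothing moved: still valid */
	pthread_mutex_unlock(&rq.lock);
	pthread_mutex_unlock(&p.pi_lock);

	p.rq = &other;				/* pretend p migrated away */
	pthread_mutex_lock(&rq.lock);
	assert(!relock_with_pi_lock(&p, &rq));	/* caller must skip p */
	pthread_mutex_unlock(&rq.lock);
	pthread_mutex_unlock(&p.pi_lock);
}
```

The re-check after retaking the lock is the whole point: anything observed about the task before the unlock has to be treated as stale.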
So we end up with a choice of two fragile.. let me ponder that a wee
bit more.