Re: [PATCH] rtmutex: multiple candidate owners without unrelatedboosting
From: Steven Rostedt
Date: Tue Dec 14 2010 - 23:16:57 EST
On Wed, 2010-12-15 at 11:41 +0800, Lai Jiangshan wrote:
> But this may lead to livelock if we do deprive candidate ownership
> as I mentioned (the mail I reply to Steven)
But I don't really agree with this livelock situation. It seems very
contrived, and as I said in my reply. If something is causing all these
tasks to have their processes boosted (and probably unboosted) there's
much bigger issues at stake here.
>
> Any random on the fly task is a reasonable task for the owner in this
> patch.
Which I disagree with. The highest priority process should be the only
one that gets it.
>
> A(assigned the first candidate ownership), B, C, D, E... are in the
> waitlist. But before new owner really take the lock, B, D 's priority
> are raised and they became the top waiter at least once,
> (D is the top at the end, D, B, A, C, E....) B and D are also
> candidate owners. In my patch, A, B, D have the chance to compete
> the lock, they are all reasonable because they are candidate owners,
> including A!
Only the highest prio process in that list should be a candidate.
>
> OK, let's consider the same case but without this patch.
> A is assigned the pending ownership. B, C, D, E... are in the
> waitlist, But before new owner really take the lock, B, D 's priority
> are raised. In the end, only A can complete the lock, B and D
> just do boost A and help A. only A!
I agree this is wrong, and we welcome the change. But I disagree that if
we wake up both B and D that either one has the chance to get that lock.
If we wake up B and then D gets boosted to a higher priority than B,
then D needs to get that lock. Period! I think you pointed out correctly
that the current code does not do that, but lets not make a "race to the
finish" to get that lock.
>
> Summary, the reason I will give the lock to some random on the fly task
> which is one of the candidate owners:
> 1) avoid livelock
Which I do not see exist.
> 2) no worse than the code without this patch.
Not an excuse. I think you brought up a good point that the current code
has a poor design in the way it handles a waiting process getting
boosted, it should have fair game to the new task. But I also believe
that if we woke this process up, and its higher prio than the current
pending owner, than it should get the lock. It only makes sense in a RT
environment where fair is not fair, or better said... fair is fair
otherwise (FIFO ;-)
>
> >
> >> + top_waiter = rt_mutex_top_waiter(lock);
> >> + top_waiter->cand_seq = lock->cand_seq;
> >> + if (!top_waiter->cand_owner) {
> >> + top_waiter->cand_owner = 1;
> >> + wake_up_process(top_waiter->task);
> >> + }
> >> + }
> >> + raw_spin_unlock(&lock->wait_lock);
> >> + goto out_put_task;
> >
> > If I understand correctly, then this is the equivalent to the
> > original breakout based on !task->pi_blocked_on, right ?
>
> Correct, original breakout when !pending_owner_task->pi_bocked_on,
> but candidate owners are still in the waitlist and ->pi_bocked_on
> still pointers to its waiter. Now we breakout when the lock
> has no owner(but has candidate owner(s) on the fly, we don't boost it).
>
> >
> >> + }
> >> put_task_struct(task);
> >
> >> */
> >> -static int try_to_take_rt_mutex(struct rt_mutex *lock)
> >> +static int try_to_take_rt_mutex(struct rt_mutex *lock, struct task_struct *task,
> >> + struct rt_mutex_waiter *waiter)
> >> {
> >> /*
> >> * We have to be careful here if the atomic speedups are
> >> @@ -390,15 +335,73 @@ static int try_to_take_rt_mutex(struct rt_mutex *lock)
> >> */
> >> mark_rt_mutex_waiters(lock);
> >>
> >> - if (rt_mutex_owner(lock) && !try_to_steal_lock(lock, current))
> >> + if (rt_mutex_owner(lock))
> >> return 0;
> >>
> >> + /*
> >> + * It is a queued waiter.
> >> + *
> >> + * Is it a candidate owner? If it is, it will win unconditionally.
> >> + * And it defeats the other candidate owner(s) (if any) and
> >> + * steal lock from them.
> >
> > Why ? If the boost code changes the top waiter and wakes up the new
> > candidate, then this new one really should get the lock and the old on
> > the fly candidate should queue itself back.
>
> See above, all candidate owner have the chance to compete the lock.
> It is required for avoiding the livelock.
There is no livelock!
>
> If we have other way to avoiding the livelock, we will use only
> one candidate owner, you, Steven and I will very happy.
Explain it again. What you said was:
"In rare condition, B,C 's priority are changed frequent", which is not
something that happens at all. If it does, we have much bigger problems
at stake.
I think your "livelock" scenario is contrived. And its not a livelock
because you will eventual run out of prios to keep boosting B and C.
>
> >
> >> + */
> >> + if (waiter) {
> >> + /* candidate owner? */
> >> + if (waiter->cand_owner && waiter->cand_seq == lock->cand_seq)
> >> + goto get_lock;
> >> +
> >> + /*
> >> + * top waiter must be a candidate owner.
> >> + * But checking it again is not a bad thing.
> >> + */
> >> + if (waiter == rt_mutex_top_waiter(lock))
> >> + goto get_lock;
> >
> > Huch ? This should never happen and therefor this should be a
> > WARN_ON_ONCE(). We really do not put "checking is not a bad thing"
> > code in such a fragile and complex construct. Something is very wrong
> > when you ever hit this.
> >
> >> + }
> >> +
> >> + /*
> >> + * Does it defeat the candidate owner(s) and steal lock from them?
> >> + *
> >> + * Note: in the rare case of a task is boosted but its waiter has not
> >> + * been requeued in the lock's wait list yet, thus it can also steal
> >> + * the lock.
> >
> > How should this happen with the new code ? The waiter remains in the
> > wait list and boosting does not change that. So that comment is
> > misleading.
>
> When someone boosted current task p, but it fail to take the p->pi_block_on->lock->wait_lock,
Nothing boosts p until it has both p->pi_lock and
p->pi_blocked_on->lock->wait_lock, so this scenario does not exist.
> and current task p success take it and call try_to_take_rt_mutex()
> before p->pi_block_on is requeued in the waitlist.
> current task should also have chance to win the lock even it is
> not candidate owner/top waiter. It will win when it has the higher
> priority than the top waiter.
I don't see how this can happen with the way the locks are taken.
>
>
> >
> >> + */
> >> + if (rt_mutex_has_waiters(lock)) {
> >> + if (task->prio >= rt_mutex_top_waiter(lock)->list_entry.prio)
> >> + return 0;
> >> + }
> >> +
> >> +get_lock:
> >> + if (waiter || rt_mutex_has_waiters(lock)) {
> >> + unsigned long flags;
> >> + struct rt_mutex_waiter *top;
> >> +
> >> + raw_spin_lock_irqsave(&task->pi_lock, flags);
> >> +
> >> + /* remove the queued waiter. */
> >> + if (waiter) {
> >> + plist_del(&waiter->list_entry, &lock->wait_list);
> >> + task->pi_blocked_on = NULL;
> >> + }
> >> +
> >> + /*
> >> + * We have to enqueue the top waiter(if have) into
> >> + * task->pi_waiters list and would get boost from it.
> >> + */
> >> + if (rt_mutex_has_waiters(lock)) {
> >> + top = rt_mutex_top_waiter(lock);
> >> + top->pi_list_entry.prio = top->list_entry.prio;
> >> + plist_add(&top->pi_list_entry, &task->pi_waiters);
> >> + __rt_mutex_adjust_prio(task);
> >
> > Why should we boost ? Because we can have several candidates on the
> > fly and we don't care which one is the highest prio candidate? That
> > looks wrong.
>
> A really owner should (be boosted)/(prepare to be booted) by the waiters.
> such code is also needed in original code.
No one could have boosted any of the waiters because we have all the
waiters blocked_on->lock->wait_list lock.
Thomas is correct.
>
> >
> >> + }
> >> + raw_spin_unlock_irqrestore(&task->pi_lock, flags);
> >
> >> @@ -424,6 +427,7 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
> >> __rt_mutex_adjust_prio(task);
> >> waiter->task = task;
> >> waiter->lock = lock;
> >> + waiter->cand_owner = 0;
> >
> > This should init waiter->cand_seq to RTMUTEX_CAND_SEQ_MAX and get rid
> > of cand_owner.
>
> good! thanks.
As I said in another email, I think we should keep cand_owner (rename it
though) and get rid of cand_seq. Looks to me that we can simply use the
top_waiter to decide who gets the lock.
-- Steve
>
> >
> >> + waiter->cand_seq = ++lock->cand_seq;
> >
> > If we make cand_seq an indicator then we need to do
> >
> > lock->cand_seq = (lock->cand_seq + 1) & ~RTMUTEX_CAND_SEQ_MAX;
> >
> >> @@ -1110,12 +1056,11 @@ int rt_mutex_finish_proxy_lock(struct rt_mutex *lock,
> >>
> >> set_current_state(TASK_INTERRUPTIBLE);
> >>
> >> - ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter,
> >> - detect_deadlock);
> >
> > Hmm, so we loose the deadlock detection in that case. Not sure whether
> > it matters though.
> >
> > All in all I like the idea and I think we should give it a try, but I
> > definitely want to test a patch against -RT, as this will shake out
> > all problems in no time.
> >
>
> Thanks,
> Lai
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/