Re: [RFC][PATCH RT 0/3] RT: Fix trylock deadlock without msleep() hack

From: Ingo Molnar
Date: Tue Sep 08 2015 - 03:31:27 EST



* Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:

> 3) sched_yield() makes me shudder
>
>    CPU0                    CPU1
>
>    taskA
>      lock(x->lock)
>
>    preemption
>    taskC
>                            taskB
>                              lock(y->lock);
>                              x = y->x;
>                              if (!try_lock(x->lock)) {
>                                unlock(y->lock);
>                                boost(taskA);
>                                sched_yield();  <- returns immediately

So I'm still struggling with properly parsing the use case.

If y->x might become invalid the moment we drop y->lock, what makes the 'taskA'
use (after we've dropped y->lock) safe? Shouldn't we at least also have a
task_get(taskA)/task_put(taskA) reference count, to make sure the boosted task
stays around?

And if we are into getting reference counts, why not solve it at a higher level
and get a reference count to 'x' to make sure it's safe to use? Then we could do:

    lock(y->lock);
retry:
    x = y->x;
    if (!trylock(x->lock)) {
        get_ref(x->count);
        unlock(y->lock);
        lock(x->lock);
        lock(y->lock);
        put_ref(x->count);
        if (y->x != x) { /* Retry if 'x' got dropped meanwhile */
            unlock(x->lock);
            goto retry;
        }
    }

Or so.

Note how much safer this sequence is, and still just as fast in the common case
(which I suppose is the main motivation within dcache.c?).
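
For illustration only, the sequence above could be sketched in userspace C
roughly as follows. This is a minimal sketch, not kernel code: it assumes
pthread mutexes in place of the RT locks and a C11 atomic counter for the
reference count. The names struct obj_x, struct obj_y and lock_y_and_x() are
hypothetical, and freeing 'x' when the count reaches zero is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the two objects in the discussion. */
struct obj_x {
	pthread_mutex_t lock;
	atomic_int count;		/* reference count pinning 'x' */
};

struct obj_y {
	pthread_mutex_t lock;
	struct obj_x *x;		/* protected by y->lock */
};

static void get_ref(struct obj_x *x)
{
	atomic_fetch_add(&x->count, 1);
}

static void put_ref(struct obj_x *x)
{
	int old = atomic_fetch_sub(&x->count, 1);

	/* Freeing at zero is omitted in this sketch; only catch underflow. */
	assert(old > 0);
}

/*
 * Take y->lock and the lock of the 'x' it currently points to.
 * Returns the locked 'x'; both locks are held on return.
 */
static struct obj_x *lock_y_and_x(struct obj_y *y)
{
	struct obj_x *x;

	pthread_mutex_lock(&y->lock);
retry:
	x = y->x;
	if (pthread_mutex_trylock(&x->lock) != 0) {
		/* Slow path: pin 'x', then retake the locks in x -> y order. */
		get_ref(x);
		pthread_mutex_unlock(&y->lock);
		pthread_mutex_lock(&x->lock);
		pthread_mutex_lock(&y->lock);
		put_ref(x);
		if (y->x != x) {	/* 'x' got replaced meanwhile */
			pthread_mutex_unlock(&x->lock);
			goto retry;
		}
	}
	return x;
}
```

A caller would unlock x->lock and y->lock when done. In the uncontended case
the only extra work is the trylock itself; the reference count is touched
only on the slow path, where we block on x->lock instead of spinning or
yielding.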

Thanks,

Ingo