Re: PI patch against 2.6.16-rt9

From: Esben Nielsen
Date: Tue Mar 28 2006 - 16:14:37 EST


On Tue, 28 Mar 2006, Ingo Molnar wrote:

>
> * Esben Nielsen <simlo@xxxxxxxxxx> wrote:
>
> > > we are observing a non-time-coherent snapshot of the locking graph.
> > > There is no guarantee that due to timeouts or signals the chain we
> > > observe isnt artificially long - while if a time-coherent snapshot is
> > > taken it is always fine. E.g. lets take dentry locks as an example:
> > > their locking is ordered by the dentry (kernel-pointer) address. We
> > > could in theory have a 'chain' of subsequent locking dependencies
> > > related to 10,000 dentries, which are nicely ordered and create a
> > > 10,000-entry 'chain' if looked at in a non-time-coherent form. I.e. your
> > > code could detect a deadlock where there's none. The more CPUs there
> > > are, the larger the likelyhood is that other CPUs 'lure us' into a long
> > > chain.
> >
> > I don't quite understand you examble: Are all 10,000 held at once?
>
> no.
>
> > If no, how are they all going to suddenly put into the lock chain due
> > to signals or timeouts? Those things unlocks locks and therefore
> > breaks the chain.
>
> the core problem with your approach is that for each step in the
> 'boosting chain' (which can be quite long in theory), all that we are
> holding is a task reference get get_task_struct(), to a task that was
> blocked before. We then make ourselves preemptible - and once get get
> back and continue the boosting chain, there is no guarantee that the
> boosting makes any sense! Normally that task will probably still be
> blocked, and we continue with our boosting. But the task could have
> gotten unblocked, it could have gotten re-blocked, and we'd continue
> doing the boosting.
>
> in short: wow do you ensure that the boosting is still part of the same
> dependency chain where it started off?
>

I don't insure that. But does it matter?!?
If the task is still blocked on a lock and the owner of that lock might
need boosting. The boosting operation itself will always be _correct_ as the
pi_lock is held when it is done. But the task doing the boosting might have
preempted for so long that there is nothing left to do - and then it
simply stops unless deadlock detection is on.

I think we talk about the situation

B locks 1 C locks 2 D locks 3
B locks 2, boosts C and block
A locks 2
A is boost B
A drop it's spinlocks and is preempted
C unlocks 2 and auto unboosts
B is running
B locks 3, boosts C and blocks
A gets a CPU again
A boosts B
A boosts D

Is there anything wrong with that?
And in the case where A==D there indeed is a deadlock which will be
detected.

Esben



> Ingo
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/