Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

From: Byungchul Park
Date: Wed Sep 14 2016 - 00:49:35 EST


On Wed, Sep 14, 2016 at 11:27 AM, Byungchul Park
<max.byungchul.park@xxxxxxxxx> wrote:
> On Wed, Sep 14, 2016 at 4:38 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> On Wed, Sep 14, 2016 at 02:12:51AM +0900, Byungchul Park wrote:
>>> On Wed, Sep 14, 2016 at 12:05 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>> >
>>> >
>>> > So the idea is to add support for non-owner serialization primitives,
>>> > like completions/semaphores, and so far I've not looked at the code yet.
>>> > I did spend 2+ hours trying to decipher your documentation thing, but am
>>> > still confused, that thing is exceedingly hard to parse/read.
>>> >
>>> > So the typical scenario would be something like:
>>> >
>>> > L a L a
>>> > U a
>>> > W b C b
>>> >
>>> > where L := lock, U := unlock, W := wait_for_completion and C :=
>>> > complete.
>>> >
>>> > On the left, the blocking thread we can easily observe the 'b depends on
>>> > a' relation, since we block while holding a.
>>>
>>> I think 'a depends on b' relation.
>>>
>>> Why does b depend on a? b depends on a's what?
>>
>> b blocks while holding a. In any case, for the graph it doesn't matter,
>> its not a directed graph as such, all we care about is acyclic.
>
> It's a directed graph.The meaning of backward traverse is different
> from the meaning of forward traverse, isn't it?
>
>>> > On the right however that same relation is hard to see, since by the
>>>
>>> Yes, there's no dependency on the right side of your example.
>>
>> Well, there is, its just not trivially observable. We must be able to
>> acquire a in order to complete b, therefore there is a dependency.
>
> No. We cannot say there is a dependency unconditionally. There can
> be a dependency or not.
>
> L a L a
> U a
> ~~~~~~~~~ what if serialized by something?
> W b C b
>
> If something we don't recognize serializes locks, which ensures
> 'W b' happens after 'L a , U a' in the other context, then there's
> no dependency here.
>
> We should say 'b depends on a' in only case that the sequence
> 'W b and then L a and then C b, where last two ops are in same
> context' _actually_ happened at least once. Otherwise, it might
> add a false dependency.
>
> It's same as how original lockdep works with typical locks. It adds
> a dependency only when a lock is actually hit.
>
>>> > time we would run complete, a has already been released.
>>>
>>> I will change your example a little bit.
>>>
>>> W b
>>> ~~~~~~~ <- serialized
>>> L a
>>> U a
>>> C b
>>>
>>> (Remind crossrelease considers dependencies in only case that
>>> lock sequence serialized is observable.)
>>
>> What does that mean? Any why? This is a random point in time without
>> actual meaning.
>
> It's not random point. We have to consider meaningful sequences among
> those which are globally observable. That's why we need to serialize
> those locks. For example,
>
> W b
> L a
> U a
> C b
>
> Once this sequence is observable globally, we can say 'It's possible to
> run in this sequence. Is this sequence problematic or not?'.
>
> L a
> U a
> W b
> C b
>
> If only this sequence can be observable, we should not assume
> this sequence can be changed. However once the former sequence
> happens, it has a possibility to hit the same sequence again later.
> So we can check deadlock possibility with the sequence,
>
> _not randomly_.
>
>>> There's a dependency in this case, like 'b depends on a' relation.
>>> 'C b' may not be hit, depending on the result of 'L a', that is,
>>> acquired or not.
>>
>> With or without a random reference point, the dependency is there.
>
> The dependency might not be there.
>
>>> > I _think_ you propose to keep track of all prior held locks and then use
>>> > the union of the held list on the block-chain with the prior held list
>>> > from the complete context.
>>>
>>> Almost right. Only thing we need to do to consider the union is to
>>> connect two chains of two contexts by adding one dependency 'b -> a'.
>>
>> Sure, but how do you arrive at which connection to make. The document is
>> entirely silent on this crucial point.
>
> We need to connect between the crosslock and the first lock among
> locks having been acquired since the crosslock was held. Others will be
> connected each other by original lockdep.
>
> By the way, does my document miss this description? If so, sorry.
> I will check and update it.
>
>> The union between the held-locks of the blocked and prev-held-locks of
>> the release should give a fair indication I think, but then, I've not
>> thought too hard on this yet.
>
> I make the union only If actually the lock sequence happened once so
> that it's globally visible.
>
>>> > The document describes you have a prior held list, and that its a
>>> > circular thing, but it then completely fails to specify what gets added
>>> > and how its used.
>>>
>>> Why does it fail? It keeps them in the form of hlock. This data is enough
>>> to generate dependencies.
>>
>> It fails to explain. It just barely mentions you keep them, it doesn't
>> mention how they're used or why.
>
> Okay. I need to explain more about that.
>
>>> > Also, by it being a circular list (of indeterminate, but finite size),
>>> > there is the possibility of missing dependencies if they got spooled out
>>> > by recent activity.
>>>
>>> Yes, right. They might be missed. It means just missing some chances to
>>> check a deadlock. It's all. Better than do nothing.
>>
>> Sure, but you didn't specify. Again, the document is very ambiguous and
>> ill specified.
>
> Yes, I need to supplement the latter half as you said. I will do it.
>
>>
>>> > The document has an example in the section 'How cross release works'
>>> > (the 3rd) that simply cannot be. It lists lock action _after_ acquire,
>>>
>>> Could you tell me more? Why cannot it be?
>>
>> Once you block you cannot take more locks, that simply doesnt make
>> sense.
>
> No. Peter.. lockdep check possibilities. All lockdep operations are
> performed assuming this same sequence can happen again later but in
> a little bit different order between more than two contexts so that actual
> deadlock can happen then.

I can see what you are talking about. you are only talking about
wait_for_completion. wait_for_completion is a case more complicated
than lock_page. Could we talk about lock_page first if you don't mind?
And then move to wait_for_complete next?

Anyway even in wait_complete, It's possible like,
(even though it's not much meaningful in case of wait_for_completion)

(Triger complete context to start)
acquire A
wait_for_completion A
acquire another (e.g. wait.lock)
release it
...
(actually sleep)

Not much meaningful in wait_for_completion, though it's not wrong.
And this is much meaningful to another crosslock like lock_page.
The target I'm mainly focusing is page lock.

> No doubt, it cannot take more locks while blocked at the time. But it's
> not important for lockdep to work.
>
>>> > but you place the acquire in wait_for_completion. We _block_ there.
>>> >
>>> > Is this the general idea?
>>> >
>>> > If so, I cannot see how something like:
>>> >
>>> > W a W b
>>> > C b C a
>>>
>>> I didn't tell it works in this case. But it can be a future work. I'm not
>>> sure but I don't think making it work is impossible at all. But anyway
>>> current implementation cannot deal with this case.
>>
>> +4. 'crosslock a -> crosslock b' dependency
>> +
>> + Creating this kind of dependency directly is unnecessary since it can
>> + be covered by other kinds of dependencies.
>>
>> Says something different, doesn't it?
>
> Really sorry. I made you much confused. I will update it.
>
>>> > On the whole 'release context' thing you want to cast lockdep in, please
>>> > consider something like:
>>> >
>>> > L a L b
>>> > L b L a
>>> > U a U a
>>> > U b U b
>>> >
>>> > Note that the release order of locks is immaterial and can generate
>>> > 'wrong' dependencies. Note how on both sides a is released under b,
>>>
>>> Who said the release order is important? Wrong for what?
>>
>> Well, again, you mention 'release context' without actually specifying
>> what you mean with that.
>>
>> L a
>> L b
>> U b
>>
>> and
>>
>> L a
>> L b
>> U a
>>
>> Here the unlocks obviously have different context. Without further
>> specification its simply impossible to tell what you mean.
>
> I mentioned what the release context means in the document. Or
> I don't understand what you are asking now.
>
> --
> Thanks,
> Byungchul



--
Thanks,
Byungchul