Re: [PATCH v3 07/15] lockdep: Implement crossrelease feature

From: Byungchul Park
Date: Sun Sep 18 2016 - 22:44:45 EST

On Wed, Sep 14, 2016 at 10:11:17AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 14, 2016 at 11:27:22AM +0900, Byungchul Park wrote:
> > > Well, there is, its just not trivially observable. We must be able to
> > > acquire a in order to complete b, therefore there is a dependency.
> >
> > No. We cannot say there is a dependency unconditionally. There can
> > be a dependency or not.
> >
> > L a                 L a
> >                     U a
> > ~~~~~~~~~ what if serialized by something?
>
> Well, there's no serialization in the example, so no what if.

It was a Korean traditional holiday for a week, so I'm late.

I mean we cannot _ensure_ there's no serialization while lockdep works.
In _the_ case you suggested, you're right, provided that code is all that
exists. But that assumption is meaningless in practice.

> > W b                 C b
> >
> > If something we don't recognize serializes locks, which ensures
> > 'W b' happens after 'L a , U a' in the other context, then there's
> > no dependency here.
>
> Its not there.
>
> > We should say 'b depends on a' in only case that the sequence
> > 'W b and then L a and then C b, where last two ops are in same
> > context' _actually_ happened at least once. Otherwise, it might
> > add a false dependency.
> >
> > It's same as how original lockdep works with typical locks. It adds
> > a dependency only when a lock is actually hit.
>
> But since these threads are independently scheduled there is no point in
> transferring the point in time thread A does W to thread B. There is no
> relation there.
>
> B could have already executed the complete or it could not yet have
> started execution at all or anything in between, entirely random.

Of course B could have already executed the complete or it could not yet
have started execution at all or anything in between. But it's not entirely
random.

It may be a random point, since the threads are independently scheduled,
but it is a random point among the valid points which lockdep needs to
consider. For example,

CONTEXT 1               CONTEXT 2(forked one)
=========               =====================
(a)                     acquire F
acquire A               acquire G
acquire B               wait_for_completion Z
acquire C
(b)                     acquire H
fork 2                  acquire I
acquire D               acquire J
complete Z              acquire K

I can provide countless examples which show you're wrong. In this case,
all acquires between (a) and (b) must be ignored when generating
dependencies with the complete operation of Z, because they happened before
context 2 was even forked and so cannot follow its wait on Z. It's never
random.
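
To make the rule concrete, here is a toy Python model of the example above.
It is only a sketch and not the patch code: the class name, the global
sequence counter and the single-waiter bookkeeping are all assumptions made
for illustration, and lock releases are not modeled at all.

```python
from itertools import count


class CrossReleaseModel:
    """Toy model: every observed event gets a global sequence stamp."""

    def __init__(self):
        self._seq = count(1)
        self.acquired = {}    # context -> list of (stamp, lock)
        self.wait_stamp = {}  # crosslock -> stamp of the first observed wait
        self.deps = set()     # (crosslock, lock): "crosslock depends on lock"

    def acquire(self, ctx, lock):
        self.acquired.setdefault(ctx, []).append((next(self._seq), lock))

    def wait_for_completion(self, crosslock):
        self.wait_stamp.setdefault(crosslock, next(self._seq))

    def complete(self, ctx, crosslock):
        start = self.wait_stamp.get(crosslock)
        if start is None:
            return  # no wait was observed yet: add nothing
        # Only acquires that actually happened after the wait began count.
        for stamp, lock in self.acquired.get(ctx, []):
            if stamp > start:
                self.deps.add((crosslock, lock))


m = CrossReleaseModel()
for lock in "ABC":
    m.acquire(1, lock)        # between (a) and (b), before the fork
m.acquire(2, "F")
m.acquire(2, "G")
m.wait_for_completion("Z")    # context 2 starts waiting
m.acquire(1, "D")             # happens after the wait in this run
m.complete(1, "Z")
print(sorted(m.deps))         # [('Z', 'D')]
```

In this particular interleaving only D ends up connected to Z. If D had
been acquired before context 2 reached wait_for_completion, this run would
add nothing, and the dependency would be recorded only in a later run where
that order was actually observed.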

Ideally, of course, it would be best to consider all points (not random
points) after (b), since those are the valid points lockdep needs to work
with. But I think it's impossible to parse and identify all synchronizations
and forks in kernel code. Furthermore, new synchronization interfaces can
be introduced in the future.

So IMHO the second best option is to consider random points among the valid
points. Those points actually happened, so it's guaranteed that each one
has a dependency with Z.

It's similar to how lockdep works for typical locks, e.g. spin locks.
Current lockdep builds its dependency graph based on the call paths which
actually happened in each context, which might differ from run to run. Even
current lockdep doesn't parse all code to identify dependencies. It works
from the actual call paths at runtime, which can be random but will
eventually cover almost all of them (not perfectly).
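
As a rough illustration of that runtime behavior, the following Python
sketch records "held -> newly acquired" edges only for acquisition orders
that were really executed, and reports a problem when a new edge would close
a cycle. It is a deliberately simplified assumption of mine (real lockdep
works on lock classes and tracks far more state); the name ToyLockdep and
its methods are hypothetical.

```python
class ToyLockdep:
    """Toy dependency tracker driven purely by observed acquisitions."""

    def __init__(self):
        self.held = {}   # context -> list of locks currently held
        self.edges = {}  # lock -> set of locks acquired while it was held

    def _reachable(self, src, dst):
        # Depth-first search over the recorded dependency edges.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get(node, ()))
        return False

    def acquire(self, ctx, lock):
        """Record held -> lock edges; return True if one closes a cycle."""
        cycle = False
        held = self.held.setdefault(ctx, [])
        for prev in held:
            if self._reachable(lock, prev):
                cycle = True
            self.edges.setdefault(prev, set()).add(lock)
        held.append(lock)
        return cycle

    def release(self, ctx, lock):
        self.held[ctx].remove(lock)


ld = ToyLockdep()
ld.acquire(1, "A")
first = ld.acquire(1, "B")    # observes A -> B
ld.release(1, "B")
ld.release(1, "A")
ld.acquire(2, "B")
second = ld.acquire(2, "A")   # observes B -> A, closing the cycle
print(first, second)          # False True
```

Nothing is reported until the B-then-A path has actually run at least once,
which is the same property crossrelease relies on.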

> > > What does that mean? Any why? This is a random point in time without
> > > actual meaning.
> >
> > It's not random point. We have to consider meaningful sequences among
> > those which are globally observable. That's why we need to serialize
> > those locks.
>
> Serialize how? there is no serialization.

I mean my crossrelease implementation does that serialization.

>
> > For example,
> >
> > W b
> > L a
> > U a
> > C b
> >
> > Once this sequence is observable globally, we can say 'It's possible to
> > run in this sequence. Is this sequence problematic or not?'.
> >
> > L a
> > U a
> > W b
> > C b
> >
> > If only this sequence can be observable, we should not assume
> > this sequence can be changed. However once the former sequence
> > happens, it has a possibility to hit the same sequence again later.
> > So we can check deadlock possibility with the sequence,
> >
> > _not randomly_.
>
> I still don't get it.
>
> > We need to connect between the crosslock and the first lock among
> > locks having been acquired since the crosslock was held.
>
> Which can be _any_ lock in the history of that thread. It could be
> rq->lock from getting the thread scheduled.

I think I already answered it. Right?

>
> > Others will be
> > connected each other by original lockdep.
> >
> > By the way, does my document miss this description? If so, sorry.
> > I will check and update it.
>
> I couldn't find anything useful, but then I could not understand most of
> what was written, and I tried hard :-(

Thank you for trying it.
