Re: [PATCH v3 03/22] perf dso: Make lock error check and add BUG_ONs

From: Arnaldo Carvalho de Melo
Date: Sat Feb 12 2022 - 10:50:03 EST


Em Sat, Feb 12, 2022 at 12:48:37PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, Feb 11, 2022 at 11:35:05AM -0800, Ian Rogers escreveu:
> > On Fri, Feb 11, 2022 at 11:21 AM Arnaldo Carvalho de Melo
> > <acme@xxxxxxxxxx> wrote:
> > >
> > > Em Fri, Feb 11, 2022 at 09:43:19AM -0800, Ian Rogers escreveu:
> > > > On Fri, Feb 11, 2022 at 9:13 AM Arnaldo Carvalho de Melo
> > > > <acme@xxxxxxxxxx> wrote:
> > > > >
> > > > > Em Fri, Feb 11, 2022 at 02:33:56AM -0800, Ian Rogers escreveu:
> > > > > > Make the pthread mutex on dso use the error check type. This allows
> > > > > > deadlock checking via the return type. Assert the returned value from
> > > > > > mutex lock is always 0.
> > > > >
> > > > > I think this is too blunt/pervasive source code wise, perhaps we should
> > > > > wrap this like its done with rwsem in tools/perf/util/rwsem.h to get
> > > > > away from pthreads primitives and make the source code look more like
> > > > > a kernel one and then, taking advantage of the so far ideologic
> > > > > needless indirection, add this BUG_ON if we build with "DEBUG=1" or
> > > > > something, wdyt?
> > > >
> > >
> > > > My concern with semaphores is that they are a concurrency primitive
> > >
> > > I'm not suggesting we switch over to semaphores, just to use the same
> > > technique of wrapping pthread_mutex_t with some other API that then
> > > allows us to add these BUG_ON() calls without polluting the source code
> > > in many places.
> >
> > Sounds simple enough and would ensure consistency too. I can add it to
> > the front of this set of changes. A different approach would be to
> > take what's here and then refactor and cleanup as a follow on patch
> > set. I'd prefer that as the size of this set of changes is already
> > larger than I like - albeit that most of it is just introducing the
>
> So, the first 4 patches in this series were already merged, as they are
> just prep work that don't add clutter, having those in the front of the
> patchkit helps picking up the low hanging fruit.

Forgot to mention, I merged, tested and alreay published it in
perf/core, i.e. no more rebases for that lot, that is how it will get
into 5.18.

Alexei's threaded record patchkit is there as well, BTW, so should help
reducing the possibility of clashes with your (and others) work.

- Arnaldo

> I usually try to pick even if it comes later, to make progress, I'll
> recheck the rest of the patchkit to see what more I can pick to reduce
> its size.
>
> - Arnaldo
>
> > use of functions to access struct variables. Perhaps I just remove the
> > BUG_ON and pthread changes here, we work to get this landed and in a
> > separate set of patches clean up the pthread mutex code to have better
> > bug checking.
> >
> > Thanks,
> > Ian
> >
> > > - Arnaldo
> > >
> > > > that has more flexibility and power than a mutex. I like a mutex as it
> > > > is quite obvious what is going on and that is good from a tooling
> > > > point of view. A deadlock with two mutexes is easy to understand. On a
> > > > semaphore, were we using it like a condition variable? There's more to
> > > > figure out. I also like the idea of compiling the perf command with
> > > > emscripten, we could then generate say perf annotate output in your
> > > > web browser. Emscripten has implementations of standard posix
> > > > libraries including pthreads, but we may need to have two approaches
> > > > in the perf code if we want to compile with emscripten and use
> > > > semaphores when targeting linux.
> > > >
> > > > Where this change comes from is that I worried that extending the
> > > > locked regions to cover the race that'd been found would then expose
> > > > the kind of recursive deadlock that pthread mutexes all too willingly
> > > > allow. With this code we at least see the bug and don't just hang. I
> > > > don't think we need the change to the mutexes for this change, but we
> > > > do need to extend the regions to fix the data race.
> > > >
> > > > Let me know how you prefer it and I can roll it into a v4 version.
> > > >
> > > > Thanks,
> > > > Ian
> > > >
> > > > > - Arnaldo
> > > > >
> > > > > > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > > > > > ---
> > > > > > tools/perf/util/dso.c | 12 +++++++++---
> > > > > > tools/perf/util/symbol.c | 2 +-
> > > > > > 2 files changed, 10 insertions(+), 4 deletions(-)
> > > > > >
> > > > > > diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
> > > > > > index 9cc8a1772b4b..6beccffeef7b 100644
> > > > > > --- a/tools/perf/util/dso.c
> > > > > > +++ b/tools/perf/util/dso.c
> > > > > > @@ -784,7 +784,7 @@ dso_cache__free(struct dso *dso)
> > > > > > struct rb_root *root = &dso->data.cache;
> > > > > > struct rb_node *next = rb_first(root);
> > > > > >
> > > > > > - pthread_mutex_lock(&dso->lock);
> > > > > > + BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
> > > > > > while (next) {
> > > > > > struct dso_cache *cache;
> > > > > >
> > > > > > @@ -830,7 +830,7 @@ dso_cache__insert(struct dso *dso, struct dso_cache *new)
> > > > > > struct dso_cache *cache;
> > > > > > u64 offset = new->offset;
> > > > > >
> > > > > > - pthread_mutex_lock(&dso->lock);
> > > > > > + BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
> > > > > > while (*p != NULL) {
> > > > > > u64 end;
> > > > > >
> > > > > > @@ -1259,6 +1259,8 @@ struct dso *dso__new_id(const char *name, struct dso_id *id)
> > > > > > struct dso *dso = calloc(1, sizeof(*dso) + strlen(name) + 1);
> > > > > >
> > > > > > if (dso != NULL) {
> > > > > > + pthread_mutexattr_t lock_attr;
> > > > > > +
> > > > > > strcpy(dso->name, name);
> > > > > > if (id)
> > > > > > dso->id = *id;
> > > > > > @@ -1286,8 +1288,12 @@ struct dso *dso__new_id(const char *name, struct dso_id *id)
> > > > > > dso->root = NULL;
> > > > > > INIT_LIST_HEAD(&dso->node);
> > > > > > INIT_LIST_HEAD(&dso->data.open_entry);
> > > > > > - pthread_mutex_init(&dso->lock, NULL);
> > > > > > + pthread_mutexattr_init(&lock_attr);
> > > > > > + pthread_mutexattr_settype(&lock_attr, PTHREAD_MUTEX_ERRORCHECK);
> > > > > > + pthread_mutex_init(&dso->lock, &lock_attr);
> > > > > > + pthread_mutexattr_destroy(&lock_attr);
> > > > > > refcount_set(&dso->refcnt, 1);
> > > > > > +
> > > > > > }
> > > > > >
> > > > > > return dso;
> > > > > > diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> > > > > > index b2ed3140a1fa..43f47532696f 100644
> > > > > > --- a/tools/perf/util/symbol.c
> > > > > > +++ b/tools/perf/util/symbol.c
> > > > > > @@ -1783,7 +1783,7 @@ int dso__load(struct dso *dso, struct map *map)
> > > > > > }
> > > > > >
> > > > > > nsinfo__mountns_enter(dso->nsinfo, &nsc);
> > > > > > - pthread_mutex_lock(&dso->lock);
> > > > > > + BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
> > > > > >
> > > > > > /* check again under the dso->lock */
> > > > > > if (dso__loaded(dso)) {
> > > > > > --
> > > > > > 2.35.1.265.g69c8d7142f-goog
> > > > >
> > > > > --
> > > > >
> > > > > - Arnaldo
> > >
> > > --
> > >
> > > - Arnaldo
>
> --
>
> - Arnaldo

--

- Arnaldo