[PATCH] perf: fallback to opening dso from outside of mount namespace

From: Ivan Babrou
Date: Thu Dec 05 2019 - 19:27:48 EST


Some tasks enter a mount namespace for isolation. This fallback
allows perf to read symbols from binaries that live outside the
mount namespace of the running task.

Signed-off-by: Ivan Babrou <ivan@xxxxxxxxxxxxxx>
---
tools/perf/util/dso.c | 7 +++++++
tools/perf/util/symbol.c | 20 +++++++++++++++-----
2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index e11ddf86f2b3..dac6bf42e43e 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -527,6 +527,13 @@ static int open_dso(struct dso *dso, struct machine *machine)
 	fd = __open_dso(dso, machine);
 	if (dso->binary_type != DSO_BINARY_TYPE__BUILD_ID_CACHE)
 		nsinfo__mountns_exit(&nsc);
+
+	if (fd < 0) {
+		fd = __open_dso(dso, machine);
+		if (fd >= 0) {
+			pr_warning("Using debug info for %s from outside of its active mount namespace.\n", dso->long_name);
+		}
+	}

 	if (fd >= 0) {
 		dso__list_add(dso);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index a8f80e427674..e85d57dfcc14 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1679,11 +1679,21 @@ int dso__load(struct dso *dso, struct map *map)
 	 * Read the build id if possible. This is required for
 	 * DSO_BINARY_TYPE__BUILDID_DEBUGINFO to work
 	 */
-	if (!dso->has_build_id &&
-	    is_regular_file(dso->long_name)) {
-		__symbol__join_symfs(name, PATH_MAX, dso->long_name);
-		if (filename__read_build_id(name, build_id, BUILD_ID_SIZE) > 0)
-			dso__set_build_id(dso, build_id);
+	if (!dso->has_build_id) {
+		bool is_reg = is_regular_file(dso->long_name);
+		if (!is_reg) {
+			nsinfo__mountns_exit(&nsc);
+			is_reg = is_regular_file(dso->long_name);
+			if (!is_reg) {
+				nsinfo__mountns_enter(dso->nsinfo, &nsc);
+			}
+		}
+
+		if (is_reg) {
+			__symbol__join_symfs(name, PATH_MAX, dso->long_name);
+			if (filename__read_build_id(name, build_id, BUILD_ID_SIZE) > 0)
+				dso__set_build_id(dso, build_id);
+		}
 	}

 	/*
--
2.24.0

On Thu, Dec 5, 2019 at 4:33 AM Arnaldo Carvalho de Melo
<arnaldo.melo@xxxxxxxxx> wrote:
>
> Em Wed, Dec 04, 2019 at 07:46:10PM -0800, Ivan Babrou escreveu:
> > We have a service that forks a child process in a namespace-based
> > sandbox where the mount namespace is intentionally designed to reflect
> > a totally empty filesystem. Our use case is very similar to Chrome's
> > sandbox, for example, but on a server. Within the sandbox, not even
> > the service's own binary is present in the mount namespace.
> >
> > Process tree looks like this:
> >
> > $ sudo pstree -psc 63989
> > edgeworker(63989)─┬─edgeworker/sbox(255716)─┬─edgeworker/zygt(255718)
> >                   │                         ├─{edgeworker/sbox}(255719)
> >                   │                         ├─{edgeworker/sbox}(255720)
> >                   │                         ├─{edgeworker/sbox}(255721)
> >                   ├─edgeworker/stry(5803)
> >                   ├─edgeworker/stry(63990)
> >                   ├─edgeworker/stry(106218)
> >                   ├─edgeworker/stry(191905)
> >                   ├─edgeworker/stry(255695)
> >                   ├─edgeworker/supr(255717)
> >
> > Here sbox processes do the actual work inside an empty mount namespace,
> > and stry is a helper process for error reporting. All tasks come from
> > the same binary that lives in the root mount namespace, launched by
> > systemd.
> >
> > When "perf script" is run on a trace obtained from the system, there
> > are three possible outcomes:
> >
> > 1. The first pid to be processed is a non-namespaced helper and
> > symbols are present.
> > 2. The first pid is not found and symbols are present.
> > 3. The first pid is a sandboxed task and symbols are missing.
> >
> > Symbols are missing, because "perf script" tries to jump into an empty
> > sandbox and find a binary there, when in fact it lives outside:
> >
> > getcwd("/state/home/ivan", 4096) = 17
> > open("/proc/self/ns/mnt", O_RDONLY) = 5
> > open("/proc/255719/ns/mnt", O_RDONLY) = 6
> > setns(6, CLONE_NEWNS) = 0
> > stat("/usr/local/bin/edgeworker", 0x7ffedb9b3ca0) = -1 ENOENT (No such file or directory)
> >
> > In the second outcome we don't have a PID to figure out the namespace
> > to jump into, so this doesn't happen. It's a good fallback, but it was
> > a bit confusing during debugging.
> >
> > It's not entirely clear to me why sometimes a helper PID is picked,
> > even though it's not the first sample in the recorded trace (at least
> > not in the output). This happens deterministically, or at least
> > appears so. In my process tree it's 255695.
> >
> > I think perf should try to fall back to the default namespace to look
> > up symbols if they are not found inside, to cover our case. The relevant
> > piece of logic is here:
>
> That should work for your use case, as you're sure that looking up by
> pathname only will find, outside the namespace, the binary you want.
>
> Even with pathname-based lookups being fragile, it works for your
> use case, so please consider providing a patch for such a fallback,
> together with a pr_debug() or even pr_warning() if this doesn't get too
> noisy, to warn the user.
>
> - Arnaldo
>
> > * https://elixir.free-electrons.com/linux/v5.4.1/source/tools/perf/util/dso.c#L520
>
> --
>
> - Arnaldo
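
For readers who want to try the lookup order discussed in this thread outside of perf, here is a minimal standalone sketch. It is not perf's implementation: lookup_with_fallback(), is_regular() and enum lookup_result are hypothetical names made up for this illustration. It assumes Linux, an absolute path, and a single-threaded caller. It mirrors the syscall sequence in the strace output above (open both ns/mnt files, setns(), stat()), then returns to the original namespace and retries there when the file was not found inside.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <limits.h>
#include <sched.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type for this sketch; not a perf type. */
enum lookup_result { NOT_FOUND = 0, FOUND_INSIDE = 1, FOUND_OUTSIDE = 2 };

/* True if path names a regular file in the current mount namespace. */
static bool is_regular(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0 && S_ISREG(st.st_mode);
}

/*
 * Look for abs_path in the mount namespace of pid first, then fall back
 * to the caller's own mount namespace. Joining a mount namespace resets
 * the working directory, so the cwd is saved and restored here.
 */
static enum lookup_result lookup_with_fallback(pid_t pid, const char *abs_path)
{
	char ns_path[64], cwd[PATH_MAX];
	enum lookup_result res = NOT_FOUND;
	bool have_cwd = getcwd(cwd, sizeof(cwd)) != NULL;
	bool back_home = true;
	int self_fd, target_fd;

	snprintf(ns_path, sizeof(ns_path), "/proc/%d/ns/mnt", (int)pid);
	self_fd = open("/proc/self/ns/mnt", O_RDONLY);
	target_fd = open(ns_path, O_RDONLY);

	/* setns() needs privileges; on failure we never leave our namespace. */
	if (self_fd >= 0 && target_fd >= 0 &&
	    setns(target_fd, CLONE_NEWNS) == 0) {
		if (is_regular(abs_path))
			res = FOUND_INSIDE;
		/* Always try to return before doing the fallback lookup. */
		if (setns(self_fd, CLONE_NEWNS) != 0) {
			perror("setns (return to original namespace)");
			back_home = false;
		} else if (have_cwd && chdir(cwd) != 0) {
			perror("chdir");
		}
	}

	/* Fallback: the file may only exist outside the target namespace. */
	if (res == NOT_FOUND && back_home && is_regular(abs_path))
		res = FOUND_OUTSIDE;

	if (self_fd >= 0)
		close(self_fd);
	if (target_fd >= 0)
		close(target_fd);
	return res;
}
```

A caller could print a warning when the result is FOUND_OUTSIDE, which is roughly what the pr_warning() in the patch does. Without sufficient privileges setns() fails, and the sketch then only ever checks the caller's own namespace.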