Re: [PATCH v2] perf cap: Tidy up and improve capability testing

From: Ian Rogers
Date: Tue Jul 30 2024 - 12:59:00 EST

Next message: Shiju Jose: "RE: [PATCH v10 01/11] EDAC: Add generic EDAC RAS control feature driver"
Previous message: Rob Herring: "Re: [PATCH v6 02/27] dt-bindings: mfd: syscon: add microchip's sam9x7 sfr"
In reply to: Namhyung Kim: "Re: [PATCH v2] perf cap: Tidy up and improve capability testing"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, Jul 30, 2024 at 9:36 AM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> On Mon, Jul 29, 2024 at 11:19:31AM -0700, Ian Rogers wrote:
> > Remove dependence on libcap. libcap is only used to query whether a
> > capability is supported, which is just 1 capget system call.
> >
> > If the capget system call fails, fall back on root permission
> > checking. Previously if libcap fails then the permission is assumed
> > not present which may be pessimistic/wrong.
> >
> > Add a used_root out argument to perf_cap__capable to say whether the
> > fall back root check was used. This allows the correct error message,
> > "root" vs "users with the CAP_PERFMON or CAP_SYS_ADMIN capability", to
> > be selected.
> >
> > Tidy uses of perf_cap__capable so that tests aren't repeated if capget
> > isn't supported, to reduce calls or refactor similar to:
> > https://lore.kernel.org/lkml/20240729004127.238611-3-namhyung@xxxxxxxxxx/
>
> I'm not familiar with the capability so it's hard to review the code but
> it'd be better to split the code for perf_cap__capable() and its usage.

There's an API change, passing the out arg if root checking was used,
so this would turn into a mess.

> > ---
> > v2: fix syscall number and '>' should have been '>='
> >
> > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > ---
> > tools/perf/Makefile.config | 11 -------
> > tools/perf/builtin-ftrace.c | 44 ++++++++++++--------------
> > tools/perf/util/Build | 2 +-
> > tools/perf/util/cap.c | 63 ++++++++++++++++++++++++++-----------
> > tools/perf/util/cap.h | 23 ++------------
> > tools/perf/util/symbol.c | 8 ++---
> > tools/perf/util/util.c | 12 +++++--
> > 7 files changed, 81 insertions(+), 82 deletions(-)
[snip]
> > --- a/tools/perf/util/cap.c
> > +++ b/tools/perf/util/cap.c
[snip]
> > +bool perf_cap__capable(int cap, bool *used_root)
> > +{
> > + struct __user_cap_header_struct header = {
> > + .version = _LINUX_CAPABILITY_VERSION_3,
> > + .pid = getpid(),
> > + };
> > + struct __user_cap_data_struct data[MAX_LINUX_CAPABILITY_U32S];
> > + __u32 cap_val;
> > +
> > + *used_root = false;
> > + while (syscall(SYS_capget, &header, &data[0]) == -1) {
> > + /* Retry, first attempt has set the header.version correctly. */
> > + if (errno == EINVAL && header.version != _LINUX_CAPABILITY_VERSION_3 &&
> > + header.version == _LINUX_CAPABILITY_VERSION_1)
>
> It seems you can just check it with _VERSION1.

Similarly not familiar. Basically there used to be a data struct of 3
u32s in v1, v2 shouldn't exist, v3 turned the data from an array of
structs of size [1] to [2]. v3 has been present for 16 years as shown
by its magic number 0x20080522 (v2.6.26 iirc - can't find the comment
again). It seems a pretty safe bet that v3 is the version being used.
In old kernels if the versions don't match they just return EINVAL and
you have to try again like this loop:
https://fossd.anu.edu.au/linux/v2.6.14/source/kernel/capability.c#L43
```
if (get_user(version, &header->version))
return -EFAULT;

if (version != _LINUX_CAPABILITY_VERSION) {
if (put_user(_LINUX_CAPABILITY_VERSION, &header->version))
return -EFAULT;
return -EINVAL;
}
```
So testing v3 is almost certainly what we want, what's going to work
and avoids a loop. In newer kernels the EINVAL isn't returned for a
mismatch and the code just does the right thing for whatever version
is requested.

> But I'm not sure about what this code does. Who set the version
> correctly? Is there any chance for an infinite loop?

The system call could return EINVAL with _LINUX_CAPABILITY_VERSION_1
and set the header.version to _LINUX_CAPABILITY_VERSION_1 again and
return EINVAL again, the kernel would be broken but it would lead to
an infinite loop.

Thanks,
Ian

Next message: Shiju Jose: "RE: [PATCH v10 01/11] EDAC: Add generic EDAC RAS control feature driver"
Previous message: Rob Herring: "Re: [PATCH v6 02/27] dt-bindings: mfd: syscon: add microchip's sam9x7 sfr"
In reply to: Namhyung Kim: "Re: [PATCH v2] perf cap: Tidy up and improve capability testing"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]