Re: [PATCH v5 01/10] capabilities: introduce CAP_PERFMON to kernel and user space

From: Stephen Smalley
Date: Wed Feb 12 2020 - 10:44:26 EST


On 2/12/20 10:21 AM, Stephen Smalley wrote:
On 2/12/20 8:53 AM, Alexey Budankov wrote:
On 12.02.2020 16:32, Stephen Smalley wrote:
On 2/12/20 3:53 AM, Alexey Budankov wrote:
Hi Stephen,

On 22.01.2020 17:07, Stephen Smalley wrote:
On 1/22/20 5:45 AM, Alexey Budankov wrote:

On 21.01.2020 21:27, Alexey Budankov wrote:

On 21.01.2020 20:55, Alexei Starovoitov wrote:
On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov
<alexey.budankov@xxxxxxxxxxxxxxx> wrote:


On 21.01.2020 17:43, Stephen Smalley wrote:
On 1/20/20 6:23 AM, Alexey Budankov wrote:

<SNIP>
Introduce CAP_PERFMON capability designed to secure system performance

Why _noaudit()? Normally only used when a permission failure is non-fatal to the operation. Otherwise, we want the audit message.

So far so good, I suggest using the simplest version for v6:

static inline bool perfmon_capable(void)
{
	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
}

It keeps the implementation simple and readable. It is also cheaper on the
common path: a CAP_PERFMON privileged process needs only one capable() call.

Yes, it bloats the audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
but that bloat also advertises and encourages the more secure CAP_PERFMON
based approach to using the perf_event_open system call.

I can live with that. We just need to document that when you see both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only allowing CAP_PERFMON first and see if that resolves the issue. We have a similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE.
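
To make the expected log pattern concrete, here is a minimal user-space sketch of the check order. It is not kernel code: the model_* names are hypothetical, and it assumes every failed check logs exactly one denial record, ignoring dontaudit rules and any layer that might deny earlier without auditing.

```c
#include <stdbool.h>

/* Capability numbers as defined in include/uapi/linux/capability.h. */
#define MODEL_CAP_SYS_ADMIN 21
#define MODEL_CAP_PERFMON   38

/* Hypothetical model state, not kernel data structures. */
static unsigned long long model_granted; /* bitmask of capabilities the model grants */
static int model_denial_records;         /* denial records "logged" so far */

/* Stand-in for capable(): assumption: a failed check logs one denial record. */
static bool model_capable(int cap)
{
	if (model_granted & (1ULL << cap))
		return true;
	model_denial_records++;
	return false;
}

/* Same shape as the proposed perfmon_capable(): CAP_PERFMON is tried first. */
static bool model_perfmon_capable(void)
{
	return model_capable(MODEL_CAP_PERFMON) ||
	       model_capable(MODEL_CAP_SYS_ADMIN);
}

/* Run one check for a given grant mask and return the records it produced. */
static int model_records_for(unsigned long long granted)
{
	model_granted = granted;
	model_denial_records = 0;
	(void)model_perfmon_capable();
	return model_denial_records;
}
```

Under those assumptions, a process granted CAP_PERFMON produces no denial record because || short-circuits, a process granted only CAP_SYS_ADMIN produces one (for CAP_PERFMON), and a process granted neither produces two, which is the paired-message case described above.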

I am trying to reproduce this double logging with CAP_PERFMON.
I am using the refpolicy version with the perf_event tclass enabled [1], in permissive mode.
When running perf stat -a I observe these AVC audit messages:

type=AVC msg=audit(1581496695.666:8691): avc: denied { open } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
type=AVC msg=audit(1581496695.666:8691): avc: denied { kernel } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
type=AVC msg=audit(1581496695.666:8691): avc: denied { cpu } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
type=AVC msg=audit(1581496695.666:8692): avc: denied { write } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1

However, there are no capability related messages around. I suppose my refpolicy
has to be modified somehow to observe capability related AVCs.

Could you please comment or clarify how to enable caps related AVCs in order
to test the logging in question?

The new perfmon permission has to be defined in your policy; you'll have a message in dmesg about "Permission perfmon in class capability2 not defined in policy.". You can either add it to the common cap2 definition in refpolicy/policy/flask/access_vectors and rebuild your policy or extract your base module as CIL, add it there, and insert the updated module.

Yes, I already have it like this:
common cap2
{
	mac_override	# unused by SELinux
	mac_admin
	syslog
	wake_alarm
	block_suspend
	audit_read
	perfmon
}

dmesg stopped reporting perfmon as not defined, but audit.log still doesn't report CAP_PERFMON denials.
BTW, audit doesn't even report CAP_SYS_ADMIN denials, although perfmon_capable() does check for it.

Some denials may be silenced by dontaudit rules; semodule -DB will strip those and semodule -B will restore them. The other possibility is that the process doesn't have CAP_PERFMON in its effective set and therefore never reaches SELinux at all; it is denied first by the capability module.
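
One way to check the second possibility from user space is to look at the CapEff mask in /proc/<pid>/status. The following is only a sketch: the helper names are hypothetical, and it assumes the CAP_PERFMON bit number is 38, as defined in include/uapi/linux/capability.h with this series applied.

```c
#include <stdio.h>
#include <string.h>

#define CAP_PERFMON_NR 38 /* assumption: CAP_PERFMON bit number from capability.h */

/* Parse one "CapEff:\t<hex>" line; return 0 on success, -1 otherwise. */
static int parse_cap_eff_line(const char *line, unsigned long long *mask)
{
	if (strncmp(line, "CapEff:", 7) != 0)
		return -1;
	return sscanf(line + 7, "%llx", mask) == 1 ? 0 : -1;
}

/* Scan a status file such as /proc/self/status for the CapEff line. */
static int read_cap_eff(const char *status_path, unsigned long long *mask)
{
	char line[256];
	int ret = -1;
	FILE *f = fopen(status_path, "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (parse_cap_eff_line(line, mask) == 0) {
			ret = 0;
			break;
		}
	}
	fclose(f);
	return ret;
}

/* Non-zero if capability number cap is set in mask. */
static int cap_in_mask(unsigned long long mask, int cap)
{
	return (int)((mask >> cap) & 1ULL);
}
```

Calling read_cap_eff() on the status file of the perf process and then cap_in_mask(mask, CAP_PERFMON_NR) shows whether the capability is in the effective set. If it is not, the missing AVC is expected, since the check fails before SELinux is consulted.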

Also, the fact that your denials are showing up in user_systemd_t suggests that something is off in your policy or userspace/distro; I assume that is a domain type for the systemd --user instance, but your shell and commands shouldn't be running in that domain (user_t would be more appropriate for that).