Re: [PATCH v2] lockdown,selinux: avoid bogus SELinux lockdown permission checks

From: Ondrej Mosnacek
Date: Fri May 28 2021 - 09:42:38 EST


(I'm off work today and plan to reply also to Paul's comments next
week, but for now let me at least share a couple quick thoughts on
Daniel's patch.)

On Fri, May 28, 2021 at 11:56 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> On 5/28/21 9:09 AM, Daniel Borkmann wrote:
> > On 5/28/21 3:37 AM, Paul Moore wrote:
> >> On Mon, May 17, 2021 at 5:22 AM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote:
> >>>
> >>> Commit 59438b46471a ("security,lockdown,selinux: implement SELinux
> >>> lockdown") added an implementation of the locked_down LSM hook to
> >>> SELinux, with the aim to restrict which domains are allowed to perform
> >>> operations that would breach lockdown.
> >>>
> >>> However, in several places the security_locked_down() hook is called in
> >>> situations where the current task isn't doing any action that would
> >>> directly breach lockdown, leading to SELinux checks that are basically
> >>> bogus.
> >>>
> >>> Since in most of these situations converting the callers such that
> >>> security_locked_down() is called in a context where the current task
> >>> would be meaningful for SELinux is impossible or very non-trivial (and
> >>> could lead to TOCTOU issues for the classic Lockdown LSM
> >>> implementation), fix this by modifying the hook to accept a struct cred
> >>> pointer as argument, where NULL will be interpreted as a request for a
> >>> "global", task-independent lockdown decision only. Then modify SELinux
> >>> to ignore calls with cred == NULL.
> >>
> >> I'm not overly excited about skipping the access check when cred is
> >> NULL. Based on the description and the little bit that I've dug into
> >> thus far it looks like using SECINITSID_KERNEL as the subject would be
> >> much more appropriate. *Something* (the kernel in most of the
> >> relevant cases it looks like) is requesting that a potentially
> >> sensitive disclosure be made, and ignoring it seems like the wrong
> >> thing to do. Leaving the access control intact also provides a nice
> >> avenue to audit these requests should users want to do that.
> >
> > I think the rationale/workaround for ignoring calls with cred == NULL (or the previous
> > patch with the unimplemented hook) from Ondrej was two-fold, at least speaking for his
> > seen tracing cases:
> >
> > i) The audit events that are triggered due to calls to security_locked_down()
> > can OOM kill a machine, see below details [0].
> >
> > ii) It seems to be causing a deadlock via slow_avc_audit() -> audit_log_end()
> > when presumingly trying to wake up kauditd [1].

Actually, I wasn't aware of the deadlock... But calling an LSM hook
[that is backed by a SELinux access check] from within a BPF helper is
calling for all kinds of trouble, so I'm not surprised :)

> Ondrej / Paul / Jiri: at least for the BPF tracing case specifically (I haven't looked
> at the rest but it's also kind of independent), the attached fix should address both
> reported issues, please take a look & test.

Thanks, I like this solution, although there are a few gotchas:

1. This patch creates a slight "regression" in that if someone flips
the Lockdown LSM into lockdown mode on runtime, existing (already
loaded) BPF programs will still be able to call the
confidentiality-breaching helpers, while before the lockdown would
apply also to them. Personally, I don't think it's a big deal (and I
bet there are other existing cases where some handle kept from before
lockdown could leak data), but I wanted to mention it in case someone
thinks the opposite.

2. IIUC. when a BPF program is rejected due to lockdown/SELinux, the
kernel will return -EINVAL to userspace (looking at
check_helper_call() in kernel/bpf/verifier.c; didn't have time to look
at other callers...). It would be nicer if the error code from the
security_locked_down() call would be passed through the call chain and
eventually returned to the caller. It should be relatively
straightforward to convert bpf_base_func_proto() to return a PTR_ERR()
instead of NULL on error, but it looks like this would result in quite
a big patch updating all the callers (and callers of callers, etc.)
with a not-so-small chance of missing some NULL check and introducing
a bug... I guess we could live with EINVAL-on-denied in stable kernels
and only have the error path refactoring in -next; I'm not sure...

3. This is a bit of a shot-in-the-dark, but I suppose there might be
some BPF programs that would be able to do something useful also when
the read_kernel helpers return an error, yet the kernel will now
outright refuse to load them (when the lockdown hook returns nonzero).
I have no idea if such BPF programs realistically exist in practice,
but perhaps it would be worth returning some dummy
always-error-returning helper function instead of NULL from
bpf_base_func_proto() when security_locked_down() returns an error.
That would also resolve (2.), basically. (Then there is the question
of what error code to use (because Lockdown LSM uses -EPERM, while
SELinux -EACCESS), but I think always returning -EPERM from such stub
helpers would be a viable choice.)

--
Ondrej Mosnacek
Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.