On Mon, May 31, 2021 at 4:24 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
On 5/29/21 8:48 PM, Paul Moore wrote:
Daniel's patch sidesteps that worry by just doing the lockdown
permission check when the BPF program is loaded, but that isn't a
great solution if the policy changes afterward. I was hoping there
might be some way to perform the permission check as needed, but the
more I look the more that appears to be difficult, if not impossible
(once again, corrections are welcome).
Your observation is correct, will try to clarify below a bit.
I'm now wondering if the right solution here is to make use of the LSM
notifier mechanism. I'm not yet entirely sure if this would work from
a BPF perspective, but I could envision the BPF subsystem registering
a LSM notification callback via register_blocking_lsm_notifier(), see
the Infiniband code as an example, and then when the LSM(s) policy
changes the BPF subsystem would get a notification and it could
revalidate the existing BPF programs and block/remove/whatever
the offending BPF programs. This obviously requires a few things
which I'm not sure are easily done, or even possible:
1. Somehow the BPF programs would need to be "marked" at
load/verification time with respect to their lockdown requirements so
that decisions can be made later. Perhaps a flag in bpf_prog_aux?
2. While it looks like it should be possible to iterate over all of
the loaded BPF programs in the LSM notifier callback via
idr_for_each(prog_idr, ...), it is not clear to me if it is possible
to safely remove, or somehow disable, BPF programs once they have been
loaded. Hopefully the BPF folks can help answer that question.
3. Disabling of BPF programs might be preferable to removing them
entirely on LSM policy changes as it would be possible to make the
lockdown state less restrictive at a future point in time, allowing
for the BPF program to be executed again. Once again, not sure if
this is even possible.
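
To make the three items above a bit more concrete, here is a rough
userspace model of the idea. It is not kernel code and does not touch
any real BPF or LSM interface. The names (sim_prog, sim_level,
revalidate_progs) and the three-level ordering are all made up for
illustration; in the kernel the callback would hang off
register_blocking_lsm_notifier() and the "mark" might live in
bpf_prog_aux, neither of which is modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lockdown levels, ordered from least to most restrictive. */
enum sim_level {
    LEVEL_NONE = 0,
    LEVEL_INTEGRITY = 1,
    LEVEL_CONFIDENTIALITY = 2,
};

/* Stand-in for a loaded BPF program. */
struct sim_prog {
    int id;
    /* Item 1: the "mark" recorded at load/verification time, i.e. the
     * most restrictive level under which this program may still run. */
    enum sim_level max_level;
    /* Item 3: programs are disabled, not removed, so they can come back. */
    bool enabled;
};

/*
 * Stand-in for the policy-change callback. Item 2: walk every loaded
 * program and re-check it against the new level. Returns how many
 * programs went from enabled to disabled in this pass.
 */
static int revalidate_progs(struct sim_prog *progs, size_t n,
                            enum sim_level now)
{
    int newly_disabled = 0;

    assert(progs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        bool allowed = now <= progs[i].max_level;

        if (progs[i].enabled && !allowed)
            newly_disabled++;
        /* A later, less restrictive policy re-enables the program. */
        progs[i].enabled = allowed;
    }
    return newly_disabled;
}
```

Calling revalidate_progs() once with a stricter level and once with a
looser one shows the disable/re-enable behaviour from item 3. What the
model leaves out is the hard part: how a disabled program is made to
stop running while it is still attached and referenced.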
Part of why this gets really complex/impossible is that BPF programs in
the kernel are reference counted from various sides, be it that there
are references from user space to them (fd from application, BPF fs, or
BPF links), hooks where they are attached to as well as tail call maps
where one BPF prog calls into another. There is currently also no global
infra of some sort where you could piggy back to atomically keep track of
all the references in a list or such. And the other thing is that BPF progs
have no ownership that is tied to a specific task after they have been
loaded. Meaning, once they are loaded into the kernel by an application
and attached to a specific hook, they can remain there potentially until
reboot of the node, so lifecycle of the user space application != lifecycle
of the BPF program.
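
As a toy illustration of the refcounting point, the sketch below models
a program that is held from several sides at once. It is plain
userspace C with invented names (ref_prog, ref_prog_get, ref_prog_put)
and only shows the counting, not the kernel's actual bpf_prog
refcounting or freeing paths.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a BPF program object with a single shared refcount. */
struct ref_prog {
    int refcnt;
    bool freed;
};

/* Take a reference, e.g. for a user space fd, a hook attachment,
 * or a tail call map slot. */
static void ref_prog_get(struct ref_prog *p)
{
    assert(!p->freed);
    p->refcnt++;
}

/* Drop a reference; the program only goes away on the last put. */
static void ref_prog_put(struct ref_prog *p)
{
    assert(p->refcnt > 0);
    if (--p->refcnt == 0)
        p->freed = true;
}
```

If the loader takes one reference for its fd, the hook takes a second
and a tail call map takes a third, then the loader exiting drops only
the first. The program stays alive with two holders, and nothing in
the object itself records who those holders are.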
I don't think the disjoint lifecycle or lack of task ownership is a
deal breaker from a LSM perspective as the LSMs can stash whatever
info they need in the security pointer during the program allocation
hook, e.g. selinux_bpf_prog_alloc() saves the security domain which
allocates/loads the BPF program.
The thing I'm worried about would be the case where a LSM policy
change requires that an existing BPF program be removed or disabled.
I'm guessing based on the refcounting that there is not presently a
clean way to remove a BPF program from the system, but is this
something we could resolve? If we can't safely remove a BPF program
from the system, can we replace/swap it with an empty/NULL BPF