Re: [RFC v3 03/22] bpf,landlock: Add a new arraymap type to deal with (Landlock) handles
From: Mickaël Salaün
Date: Wed Oct 05 2016 - 18:04:14 EST
On 04/10/2016 01:53, Kees Cook wrote:
> On Wed, Sep 14, 2016 at 12:23 AM, Mickaël Salaün <mic@xxxxxxxxxxx> wrote:
>> This new arraymap looks like a set and brings new properties:
>> * strong typing of entries: the eBPF functions get the array type of
>> elements instead of CONST_PTR_TO_MAP (e.g.
>> CONST_PTR_TO_LANDLOCK_HANDLE_FS);
>> * forced sequential filling (i.e. replace or append-only update), which
>> allows quick browsing of all entries.
>>
>> This strong typing is useful to statically check if the content of a map
>> can be passed to an eBPF function. For example, Landlock uses it to store
>> and manage kernel objects (e.g. struct file) instead of dealing with
>> userland raw data. This improves efficiency and ensures that an eBPF
>> program can only call functions with the right high-level arguments.
>>
>> The enum bpf_map_handle_type lists low-level types (e.g.
>> BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD) which are identified when
>> updating a map entry (handle). These handle types are used to infer a
>> high-level arraymap type, which is listed in enum bpf_map_array_type
>> (e.g. BPF_MAP_ARRAY_TYPE_LANDLOCK_FS).
>>
>> For now, this new arraymap is only used by Landlock LSM (cf. next
>> commits) but it could be useful for other needs.
>>
>> Changes since v2:
>> * add a RLIMIT_NOFILE-based limit to the maximum number of arraymap
>> handle entries (suggested by Andy Lutomirski)
>> * remove useless checks
>>
>> Changes since v1:
>> * arraymap of handles replace custom checker groups
>> * simpler userland API
>>
>> Signed-off-by: Mickaël Salaün <mic@xxxxxxxxxxx>
>> Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
>> Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
>> Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
>> Cc: David S. Miller <davem@xxxxxxxxxxxxx>
>> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
>> Link: https://lkml.kernel.org/r/CALCETrWwTiz3kZTkEgOW24-DvhQq6LftwEXh77FD2G5o71yD7g@xxxxxxxxxxxxxx
>> ---
>> include/linux/bpf.h | 14 ++++
>> include/uapi/linux/bpf.h | 18 +++++
>> kernel/bpf/arraymap.c | 203 +++++++++++++++++++++++++++++++++++++++++++++++
>> kernel/bpf/verifier.c | 12 ++-
>> 4 files changed, 246 insertions(+), 1 deletion(-)
>>
>> [...]
>> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
>> index a2ac051c342f..94256597eacd 100644
>> --- a/kernel/bpf/arraymap.c
>> +++ b/kernel/bpf/arraymap.c
>> [...]
>> + /*
>> + * Limit number of entries in an arraymap of handles to the maximum
>> + * number of open files for the current process. The maximum number of
>> + * handle entries (including all arraymaps) for a process is then
>> + * (RLIMIT_NOFILE - 1) * RLIMIT_NOFILE. If the process' RLIMIT_NOFILE
>> + * is 0, then any entry update is forbidden.
>> + *
>> + * An eBPF program can inherit all the arraymap FD. The worse case is
>> + * to fill a bunch of arraymaps, create an eBPF program, close the
>> + * arraymap FDs, and start again. The maximum number of arraymap
>> + * entries can then be close to RLIMIT_NOFILE^3.
>> + *
>> + * FIXME: This should be improved... any idea?
>> + */
>> + if (unlikely(index >= rlimit(RLIMIT_NOFILE)))
>> + return -EMFILE;
>
> I'm not sure what's best for resource management here. Landlock will
> be holding open path structs, for example, but how are you expecting
> to track things like network policies? An allowed IP address, for
> example, doesn't have a handle outside of doing a full
> socket()/connect() setup.
Path and file references are hard to handle correctly, but other things
should be simpler. External resources (i.e. not relative to the running
system, as paths are) like network hosts or ports could simply be
expressed as raw values (as is done for iptables rules). Moreover, for
network rules, relying on raw packet values (which
BPF_PROG_TYPE_SOCKET_FILTER programs have access to) may be more than enough.
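
To make the "raw values" idea concrete, here is a minimal userspace sketch of
a network rule that needs no kernel object at all. The struct net_rule layout
and the net_rule_match() helper are hypothetical names used only for
illustration; they are not part of this patch series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rule layout (assumption, not from the patch series). */
struct net_rule {
	uint32_t addr;	/* IPv4 network address, host byte order */
	uint32_t mask;	/* network mask, host byte order */
	uint16_t port;	/* allowed destination port, 0 matches any port */
};

/* Return true if the (addr, port) pair is covered by the rule. */
static bool net_rule_match(const struct net_rule *rule, uint32_t addr,
			   uint16_t port)
{
	/* Compare only the bits selected by the mask. */
	if ((addr & rule->mask) != (rule->addr & rule->mask))
		return false;
	return rule->port == 0 || rule->port == port;
}
```

Such a rule only costs the memory of the entry itself: nothing else in the
kernel has to be kept alive for it.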
>
> I think an explicit design for resource management should be
> considered up front...
I'm not really sure how to handle that part...
There are basically two ways to express a "kernel object": relative (with
an internal pointer to a struct, e.g. struct file) or absolute (a raw
value). Both of them use kernel memory. However, only the former may
impact other parts of the kernel (e.g. it can force the kernel to keep an
object like a struct dentry alive). The impact of this is not clear to me,
but it looks like other per-process resource management: number of open
files, number of network connections...
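
The relative/absolute distinction can be sketched as a tagged union. This is
only a userspace illustration under my own naming (struct handle,
handle_kind, count_pinned() are hypothetical); the void pointer stands in for
something like a struct file pointer.

```c
#include <stddef.h>
#include <stdint.h>

enum handle_kind {
	HANDLE_RELATIVE,	/* points to an object that must stay alive */
	HANDLE_ABSOLUTE,	/* plain raw value, nothing else is pinned */
};

/* Hypothetical handle entry (assumption, not the patch's layout). */
struct handle {
	enum handle_kind kind;
	union {
		void *object;	/* stand-in for e.g. a struct file pointer */
		uint64_t raw;	/* e.g. an address or a port number */
	} u;
};

/* Count the entries that keep another object alive. */
static size_t count_pinned(const struct handle *handles, size_t n)
{
	size_t i, pinned = 0;

	for (i = 0; i < n; i++) {
		if (handles[i].kind == HANDLE_RELATIVE &&
		    handles[i].u.object != NULL)
			pinned++;
	}
	return pinned;
}
```

Every entry costs memory, but only the ones counted here would have an effect
beyond the map itself.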
The most reasonable approach seems to be to charge the user for the (kernel)
memory dedicated to the user's policy. How can I do it? Maybe by
decrementing RLIMIT_NPROC and checking RLIMIT_AS (i.e. acting like a
virtual process)?
There are no such limits with other eBPF maps (even those dealing with
FDs), so this may be too much.
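
As a rough userspace illustration of the RLIMIT_AS idea, a check could look
like the following. The helper names and the notion of a per-user "used"
counter are assumptions for the sake of the sketch; only getrlimit(),
RLIMIT_AS and RLIM_INFINITY are real interfaces here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: would charging `request` more bytes on top of
 * `used` stay within `limit`? Written to avoid integer overflow.
 */
static bool policy_charge_fits(rlim_t limit, uint64_t used, uint64_t request)
{
	if (limit == RLIM_INFINITY)
		return true;
	if (used > limit)
		return false;
	return request <= limit - used;
}

/* Same check against the caller's current RLIMIT_AS soft limit. */
static bool policy_charge_fits_rlimit_as(uint64_t used, uint64_t request)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_AS, &rl) != 0)
		return false;	/* be conservative if the limit is unknown */
	return policy_charge_fits(rl.rlim_cur, used, request);
}
```

Where the "used" counter would live and when it would be uncharged is exactly
the open question.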
Mickaël