Re: [PATCH net-next 3/4] bpf: add support for persistent maps/progs

From: Daniel Borkmann
Date: Mon Oct 19 2015 - 05:51:58 EST


On 10/19/2015 09:36 AM, Hannes Frederic Sowa wrote:
Hi,

On Sun, Oct 18, 2015, at 22:59, Alexei Starovoitov wrote:
On 10/18/15 9:49 AM, Daniel Borkmann wrote:
Okay, I have pushed some rough working proof of concept here:

https://git.breakpoint.cc/cgit/dborkman/net-next.git/log/?h=ebpf-fds-final5

So the idea eventually had to be slightly modified after giving this
further thought, and is the following:

We have 3 commands (BPF_DEV_CREATE, BPF_DEV_DESTROY, BPF_DEV_CONNECT), and
related to that a bpf_attr extension with only a single __u32 fd member
in it.
...
The nice thing about it is that you can create/unlink as many as you want,
but when you remove the real device from an application via
bpf_dev_destroy(fd), then all links disappear with it. Just like in the
case of a normal device driver.

interesting idea!
What happens if user app creates a dev via bpf_dev_create(), exits and
then admin does rm of that dev ?
Looks like map/prog will leak ?
So the only proper way to delete such cdevs is via bpf_dev_destroy ?

The mknod is not the holder; rather, the kobject, which should be
represented in sysfs, will be. So you can still get the map major:minor
by looking up the /dev file in the corresponding sysfs directory, or I
think we should provide an 'unbind' file, which will drop the kobject if
the user writes a '1' to it.

I agree, this could still be done.

On device creation, the kernel will return the minor number via bpf(2),
so you can access the file easily, e.g. /dev/bpf/bpf_map<minor> or
/dev/bpf/bpf_prog<minor>, and then move on with mknod(2) or symlink(2)
from there if desired.
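
As an illustration of the naming scheme described above, here is a minimal
C sketch that builds the node path from the minor number returned by bpf(2).
The helper bpf_dev_path() and the enum bpf_node_kind are hypothetical names
made up for this example and are not part of the proof of concept; only the
/dev/bpf/bpf_map<minor> and /dev/bpf/bpf_prog<minor> patterns come from the
mail itself.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: which kind of node the minor number refers to. */
enum bpf_node_kind { BPF_NODE_MAP, BPF_NODE_PROG };

/*
 * Hypothetical helper: format the device node path for a given minor.
 * Returns 0 on success, -1 if the buffer is too small or formatting fails.
 */
static int bpf_dev_path(char *buf, size_t len, enum bpf_node_kind kind,
                        unsigned int minor)
{
    const char *prefix = (kind == BPF_NODE_MAP) ? "bpf_map" : "bpf_prog";
    int n = snprintf(buf, len, "/dev/bpf/%s%u", prefix, minor);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting path could then be handed to open(2), or used as the target
of a symlink(2), as mentioned above.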

what if admin mknod in that dir with some arbitrary minor ?

Basically, -EIO. :)

mknod will succeed, but it won't hold anything?

That is currently the case for basically all mknod operations, including
those done by udev.
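
To make the -EIO case concrete, here is a toy userspace model in C of the
behaviour discussed above: a device node only carries a minor number, and a
connect attempt fails unless a live object is registered under that minor.
This is only a sketch under stated assumptions, not the kernel code from the
proof of concept. All toy_* names, the table size, and the -ENOSPC/-ENOENT
return values are invented for the example; only the -EIO result for a stray
minor comes from the thread, and returning -EIO after destroy is an
assumption as well.

```c
#include <errno.h>
#include <stdbool.h>

#define TOY_MAX_MINORS 8

/* Toy registry standing in for the kernel's table of live bpf devices. */
static bool toy_live[TOY_MAX_MINORS];

/* Models bpf_dev_create(): claims the first free minor and returns it. */
static int toy_dev_create(void)
{
    for (int m = 0; m < TOY_MAX_MINORS; m++) {
        if (!toy_live[m]) {
            toy_live[m] = true;
            return m;
        }
    }
    return -ENOSPC; /* assumption: error code chosen for the sketch */
}

/* Models bpf_dev_destroy(): drops the object behind the minor. */
static int toy_dev_destroy(int minor)
{
    if (minor < 0 || minor >= TOY_MAX_MINORS || !toy_live[minor])
        return -ENOENT; /* assumption: error code chosen for the sketch */
    toy_live[minor] = false;
    return 0;
}

/*
 * Models bpf_dev_connect() on a node carrying `minor`. A node created by
 * hand with an arbitrary minor exists in the file system but holds
 * nothing, so the connect attempt fails with -EIO.
 */
static int toy_dev_connect(int minor)
{
    if (minor < 0 || minor >= TOY_MAX_MINORS || !toy_live[minor])
        return -EIO;
    return 0;
}
```

In this model, a mknod with an arbitrary minor corresponds to calling
toy_dev_connect() with a minor that toy_dev_create() never handed out.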

looks like bpf_dev_connect will handle it gracefully.
So these cdevs should only be created and destroyed via the bpf syscall,
and the only sensible operations on them are open(), to get an fd to pass
to bpf_dev_connect, and symlink. Anything else the admin should be
careful not to do. Right?

Besides maybe some statistics and other stuff in sysfs directory, no,
that is all.

Bye,
Hannes