Re: [PATCH net-next 3/4] bpf: add support for persistent maps/progs

From: Alexei Starovoitov
Date: Mon Oct 19 2015 - 12:22:22 EST


On 10/19/15 7:23 AM, Daniel Borkmann wrote:
The mknod is not the holder; rather, the kobject, which should be
represented in sysfs, will be. So you can still get the map major:minor
by looking up the /dev file in the corresponding sysfs directory, or I
think we should provide an 'unbind' file, which will drop the kobject if
the user writes a '1' to it.

I agree, this could still be done.

imo doing 'rm' is way cleaner than dealing with an 'unbind' file.

As Hannes said, under /sys/class/bpf/ an admin can see all held nodes, so
visibility is there for free at all times. The device management (creation/
deletion) itself and the mknod's pointing to it are simply decoupled.

This whole approach looks sound to me, also integrates nicely into the
existing Linux facilities, and works on top of every fs supporting special
files. Much cleaner than an extra file-system that would be required by a
syscall in order to make the syscall work.

Thanks for the explanations. I think I have the complete picture now of
how such cdevs will be used, and I don't like it.
There is nothing in linux or any unix that creates thousands of cdevs
on the fly, but here user apps will create/destroy them back and forth
and they would need to do it quickly. The whole sysfs/kobj baggage is
completely unnecessary here. The kernel will consume more memory for
no real reason other than that cdevs are used to keep progs/maps around.
imo fs is cleaner and we can tailor it to be similar to cdev style.
For example we can make bpffs automount in /sys/kernel/bpf/ as the
standard location and have one directory structure for all mounts (like
tracefs). Then within it have an idr mechanism to create bpf_progX and
bpf_mapY special files via BPF_PIN_FD bpf syscall with a single FD
argument. At this point the fs and cdev approaches look exactly the same
from the user's point of view, but the overhead of fs is significantly
lower, and normal 'rm' works just fine and is much faster.
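
To make the bpf_progX/bpf_mapY naming idea concrete, here is a rough
userspace sketch. It is not kernel code and not a proposed patch: the
id_pool type and the pin_name() helper are hypothetical stand-ins for an
idr, and the assumption that a freed ID gets handed out again is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_IDS 64 /* arbitrary cap for this sketch */

/* Hypothetical userspace stand-in for a kernel idr: tracks IDs in use. */
struct id_pool {
    bool used[MAX_IDS];
};

/* Start with every ID free. */
static void id_pool_init(struct id_pool *pool)
{
    memset(pool, 0, sizeof(*pool));
}

/* Return the lowest free ID and mark it used, or -1 if the pool is full. */
static int id_pool_alloc(struct id_pool *pool)
{
    for (int id = 0; id < MAX_IDS; id++) {
        if (!pool->used[id]) {
            pool->used[id] = true;
            return id;
        }
    }
    return -1;
}

/* Release an ID so a later allocation can reuse it (the 'rm' case). */
static void id_pool_free(struct id_pool *pool, int id)
{
    assert(id >= 0 && id < MAX_IDS);
    pool->used[id] = false;
}

/*
 * Build the node name for a newly pinned object, "bpf_map<Y>" or
 * "bpf_prog<X>", where kind is "map" or "prog".
 * Returns the allocated ID, or -1 if no ID is free or buf is too small.
 */
static int pin_name(struct id_pool *pool, const char *kind,
                    char *buf, size_t len)
{
    int id = id_pool_alloc(pool);
    if (id < 0)
        return -1;

    int n = snprintf(buf, len, "bpf_%s%d", kind, id);
    if (n < 0 || (size_t)n >= len) {
        id_pool_free(pool, id); /* do not leak the ID on failure */
        return -1;
    }
    return id;
}
```

With one pool per object type, two pin_name() calls on a fresh map pool
give bpf_map0 and bpf_map1. After id_pool_free() on ID 0, the next call
gives bpf_map0 again, which is the cheap create/destroy cycle argued for
above.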
