Re: [PATCH bpf-next v8 5/7] bpf: Implement bpf_local_storage for inodes

From: Martin KaFai Lau
Date: Tue Aug 18 2020 - 11:23:51 EST


On Tue, Aug 18, 2020 at 05:10:34PM +0200, KP Singh wrote:
>
>
> On 8/18/20 3:27 AM, Martin KaFai Lau wrote:
> > On Mon, Aug 03, 2020 at 06:46:53PM +0200, KP Singh wrote:
> >> From: KP Singh <kpsingh@xxxxxxxxxx>
> >>
> >> Similar to bpf_local_storage for sockets, add local storage for inodes.
> >> The life-cycle of storage is managed with the life-cycle of the inode.
> >> i.e. the storage is destroyed along with the owning inode.
> >>
> >> The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the
> >> security blob which are now stackable and can co-exist with other LSMs.
> >>
> >> Signed-off-by: KP Singh <kpsingh@xxxxxxxxxx>
> >> ---
> >> include/linux/bpf_local_storage.h | 10 +
> >> include/linux/bpf_lsm.h | 21 ++
> >> include/linux/bpf_types.h | 3 +
> >> include/uapi/linux/bpf.h | 38 +++
> >> kernel/bpf/Makefile | 1 +
>
> [...]
>
> ata *inode_storage_lookup(struct inode *inode,
> >> + struct bpf_map *map,
> >> + bool cacheit_lockit)
> >> +{
> >> + struct bpf_local_storage *inode_storage;
> >> + struct bpf_local_storage_map *smap;
> >> + struct bpf_storage_blob *bsb;
> >> +
> >> + bsb = bpf_inode(inode);
> >> + if (!bsb)
> >> + return ERR_PTR(-ENOENT);
> > ERR_PTR is returned here...
> >
> >> +
> >> + inode_storage = rcu_dereference(bsb->storage);
> >> + if (!inode_storage)
> >> + return NULL;
> >> +
>
> [...]
>
> >> + kfree_rcu(local_storage, rcu);
> >> +}
> >> +
> >> +
> >> +static void *bpf_fd_inode_storage_lookup_elem(struct bpf_map *map, void *key)
> >> +{
> >> + struct bpf_local_storage_data *sdata;
> >> + struct file *f;
> >> + int fd;
> >> +
> >> + fd = *(int *)key;
> >> + f = fcheck(fd);
> >> + if (!f)
> >> + return ERR_PTR(-EINVAL);
> >> +
> >> + get_file(f);
> >> + sdata = inode_storage_lookup(f->f_inode, map, true);
> >> + fput(f);
> >> + return sdata ? sdata->data : NULL;
> > sdata can be ERR_PTR here and a few other cases below.
> >
> > May be inode_storage_lookup() should just return NULL.
>
> I think returning NULL is a better option. Thanks!
>
> >
> >> +}
> >> +
> >> +static int bpf_fd_inode_storage_update_elem(struct bpf_map *map, void *key,
> >> + void *value, u64 map_flags)
> >> +{
> >> + struct bpf_local_storage_data *sdata;
> >> + struct file *f;
> >> + int fd;
> >> +
> >> + fd = *(int *)key;
> >> + f = fcheck(fd);
> >> + if (!f)
> >> + return -EINVAL;
> >> +
> >> + get_file(f);> get_file() does atomic_long_inc() instead of atomic_long_inc_not_zero().
> > I don't see how that helps here. Am I missing something?
>
> You are right, this should not not be an fcheck followed by a get_file
> rather fcheck followed by get_file_rcu:
>
> #define get_file_rcu_many(x, cnt) \
> atomic_long_add_unless(&(x)->f_count, (cnt), 0)
> #define get_file_rcu(x) get_file_rcu_many((x), 1)
> #define file_count(x) atomic_long_read(&(x)->f_count)
>
> But there is an easier way than all of this and this is to use
> fget_raw which also calls get_file_rcu_many
> and ensures a non-zero count before getting a reference.
ic. Make sense.

There are fdget() and fdput() also which are used in bpf/syscall.c.