Re: [PATCH bpf-next 2/4] bpf: Implement bpf_local_storage for inodes

From: KP Singh
Date: Tue May 26 2020 - 22:11:18 EST


Thanks for taking a look!

On 26-May 17:49, Alexei Starovoitov wrote:
> On Tue, May 26, 2020 at 06:33:34PM +0200, KP Singh wrote:
> >
> > +static struct bpf_local_storage_data *inode_storage_update(
> > + struct inode *inode, struct bpf_map *map, void *value, u64 map_flags)
> > +{
> > + struct bpf_local_storage_data *old_sdata = NULL;
> > + struct bpf_local_storage_elem *selem;
> > + struct bpf_local_storage *local_storage;
> > + struct bpf_local_storage_map *smap;
> > + int err;
> > +
> > + err = check_update_flags(map, map_flags);
> > + if (err)
> > + return ERR_PTR(err);
> > +
> > + smap = (struct bpf_local_storage_map *)map;
> > + local_storage = rcu_dereference(inode->inode_bpf_storage);
> > +
> > + if (!local_storage || hlist_empty(&local_storage->list)) {
> > + /* Very first elem for this inode */
> > + err = check_flags(NULL, map_flags);
> > + if (err)
> > + return ERR_PTR(err);
> > +
> > + selem = selem_alloc(smap, value);
> > + if (!selem)
> > + return ERR_PTR(-ENOMEM);
> > +
> > + err = inode_storage_alloc(inode, smap, selem);
>
> inode_storage_update looks like big copy-paste except above one line.
> pls consolidate.

Sure.

>
> > +BPF_CALL_4(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode,
> > + void *, value, u64, flags)
> > +{
> > + struct bpf_local_storage_data *sdata;
> > +
> > + if (flags > BPF_LOCAL_STORAGE_GET_F_CREATE)
> > + return (unsigned long)NULL;
> > +
> > + sdata = inode_storage_lookup(inode, map, true);
> > + if (sdata)
> > + return (unsigned long)sdata->data;
> > +
> > + if (flags == BPF_LOCAL_STORAGE_GET_F_CREATE &&
> > + atomic_inc_not_zero(&inode->i_count)) {
> > + sdata = inode_storage_update(inode, map, value, BPF_NOEXIST);
> > + iput(inode);
> > + return IS_ERR(sdata) ?
> > + (unsigned long)NULL : (unsigned long)sdata->data;
> > + }
>
> This is wrong. You cannot just copy paste the refcounting logic
> from bpf_sk_storage_get(). sk->sk_refcnt is very different from inode->i_count.
> To start, the inode->i_count cannot be incremented without lock.

Good catch! Agreed, Jann pointed out that this can lead to bugs
similar to https://crbug.com/project-zero/2015.

> If you really need to do it you need igrab().
> Secondly, the iput() is not possible to call from bpf prog yet, since

> progs are not sleepable and iput() may call iput_final() which may sleep.

Agreed, I will send a separate patch to add a might_sleep call to
iput() which currently only has a "Consequently, iput() can sleep."
warning in the comments so that this can be caught by
CONFIG_DEBUG_ATOMIC_SLEEP.

> But considering that only lsm progs from lsm hooks will call bpf_inode_storage_get()
> the inode is not going to disappear while this function is running.

If the inode pointer is an argument to the LSM hook, it won't
disappear and yes this does hold generally true for the other
use-cases as well.

> So why touch i_count ?
>
> > +
> > + return (unsigned long)NULL;
> > +}
> > +
> > BPF_CALL_2(bpf_sk_storage_delete, struct bpf_map *, map, struct sock *, sk)
> > {
> > if (refcount_inc_not_zero(&sk->sk_refcnt)) {
> > @@ -957,6 +1229,20 @@ BPF_CALL_2(bpf_sk_storage_delete, struct bpf_map *, map, struct sock *, sk)
> > return -ENOENT;
> > }
> >
> > +BPF_CALL_2(bpf_inode_storage_delete,
> > + struct bpf_map *, map, struct inode *, inode)
> > +{
> > + int err;
> > +
> > + if (atomic_inc_not_zero(&inode->i_count)) {
> > + err = inode_storage_delete(inode, map);
> > + iput(inode);
> > + return err;
> > + }
>
> ditto.
>
> > +
> > + return inode_storage_delete(inode, map);
>
> bad copy-paste from bpf_sk_storage_delete?
> or what is this logic suppose to do?

The former :) fixed...

- KP