RE: [PATCH 08/17] net: convert sk_filter.refcnt from atomic_t to refcount_t

From: Reshetova, Elena
Date: Fri Mar 17 2017 - 04:14:24 EST



> On 03/16/2017 04:28 PM, Elena Reshetova wrote:
> > refcount_t type and corresponding API should be
> > used instead of atomic_t when the variable is used as
> > a reference counter. This allows to avoid accidental
> > refcounter overflows that might lead to use-after-free
> > situations.
> >
> > Signed-off-by: Elena Reshetova <elena.reshetova@xxxxxxxxx>
> > Signed-off-by: Hans Liljestrand <ishkamiel@xxxxxxxxx>
> > Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
> > Signed-off-by: David Windsor <dwindsor@xxxxxxxxx>
> > ---
> > include/linux/filter.h | 3 ++-
> > net/core/filter.c | 7 ++++---
> > 2 files changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/filter.h b/include/linux/filter.h
> > index 8053c38..20247e7 100644
> > --- a/include/linux/filter.h
> > +++ b/include/linux/filter.h
> > @@ -7,6 +7,7 @@
> > #include <stdarg.h>
> >
> > #include <linux/atomic.h>
> > +#include <linux/refcount.h>
> > #include <linux/compat.h>
> > #include <linux/skbuff.h>
> > #include <linux/linkage.h>
> > @@ -431,7 +432,7 @@ struct bpf_prog {
> > };
> >
> > struct sk_filter {
> > - atomic_t refcnt;
> > + refcount_t refcnt;
> > struct rcu_head rcu;
> > struct bpf_prog *prog;
> > };
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index ebaeaf2..62267e2 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -928,7 +928,7 @@ static void sk_filter_release_rcu(struct rcu_head *rcu)
> > */
> > static void sk_filter_release(struct sk_filter *fp)
> > {
> > - if (atomic_dec_and_test(&fp->refcnt))
> > + if (refcount_dec_and_test(&fp->refcnt))
> > call_rcu(&fp->rcu, sk_filter_release_rcu);
> > }
> >
> > @@ -950,7 +950,7 @@ bool sk_filter_charge(struct sock *sk, struct sk_filter *fp)
> > /* same check as in sock_kmalloc() */
> > if (filter_size <= sysctl_optmem_max &&
> > atomic_read(&sk->sk_omem_alloc) + filter_size < sysctl_optmem_max) {
> > - atomic_inc(&fp->refcnt);
> > + refcount_inc(&fp->refcnt);
> > atomic_add(filter_size, &sk->sk_omem_alloc);
> > return true;
> > }
> > @@ -1179,12 +1179,13 @@ static int __sk_attach_prog(struct bpf_prog *prog, struct sock *sk)
> > return -ENOMEM;
> >
> > fp->prog = prog;
> > - atomic_set(&fp->refcnt, 0);
> > + refcount_set(&fp->refcnt, 1);
> >
> > if (!sk_filter_charge(sk, fp)) {
> > kfree(fp);
> > return -ENOMEM;
> > }
> > + refcount_set(&fp->refcnt, 1);
>
> Regarding the two subsequent refcount_set(, 1) that look a bit strange
> due to the sk_filter_charge() having refcount_inc() I presume ... can't
> the refcount API handle such corner case?

Yes, it is exactly because of the refcount_inc() from zero in sk_filter_charge().
refcount_inc() refuses to increment from zero for security reasons: a counter
at zero means the object may already be on its way to being freed, so taking a
new reference could lead to a use-after-free. At some point in the past we
discussed refcount_inc_not_one(), but it was decided to be too special a case
to support (we really have very few such cases).
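
To illustrate the behavior, here is a minimal user-space sketch using C11
atomics. It is only a model: the model_* names are hypothetical, this is not
the kernel's refcount_t implementation, and the kernel's warning and
saturation handling are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's refcount_t. */
typedef struct {
	atomic_uint refs;
} model_refcount_t;

static void model_refcount_set(model_refcount_t *r, unsigned int n)
{
	atomic_store(&r->refs, n);
}

static unsigned int model_refcount_read(model_refcount_t *r)
{
	return atomic_load(&r->refs);
}

/* Returns false and leaves the counter untouched if it is already zero. */
static bool model_refcount_inc_not_zero(model_refcount_t *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0)
			return false;
		/* On failure the CAS reloads 'old', so the zero check is repeated. */
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));

	return true;
}

/* Same shape as the attach path: inc from 0 is refused, inc after set(1) works. */
static void model_demo(void)
{
	model_refcount_t ref;

	atomic_init(&ref.refs, 0);
	assert(!model_refcount_inc_not_zero(&ref));
	assert(model_refcount_read(&ref) == 0);

	model_refcount_set(&ref, 1);
	assert(model_refcount_inc_not_zero(&ref));
	assert(model_refcount_read(&ref) == 2);
}
```

This is why the patch sets the counter to 1 before calling sk_filter_charge()
and then resets it to 1 afterwards.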


> Or alternatively let the
> sk_filter_charge() handle it, for example:
>
> bool __sk_filter_charge(struct sock *sk, struct sk_filter *fp)
> {
> u32 filter_size = bpf_prog_size(fp->prog->len);
>
> /* same check as in sock_kmalloc() */
> if (filter_size <= sysctl_optmem_max &&
> atomic_read(&sk->sk_omem_alloc) + filter_size <
> sysctl_optmem_max) {
> atomic_add(filter_size, &sk->sk_omem_alloc);
> return true;
> }
> return false;
> }
>
> And this goes to filter.h:
>
> bool __sk_filter_charge(struct sock *sk, struct sk_filter *fp);
>
> bool sk_filter_charge(struct sock *sk, struct sk_filter *fp)
> {
> bool ret = __sk_filter_charge(sk, fp);
> if (ret)
> refcount_inc(&fp->refcnt);
> return ret;
> }
>
> ... and let __sk_attach_prog() call __sk_filter_charge() and only do
> the second refcount_set()?
>
> > old_fp = rcu_dereference_protected(sk->sk_filter,
> > lockdep_sock_is_held(sk));
> >

Oh, yes, this would make it look less awkward. Thank you for the suggestion, Daniel!
I guess we were trying to keep the code changes as non-invasive as possible,
maybe even too carefully...
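
For reference, the split suggested above could be modeled in user space
roughly as follows. This is an untested sketch under stated assumptions, not
the updated patch: the model_* types and functions and the model_optmem_max
value are hypothetical stand-ins for struct sock, struct sk_filter and
sysctl_optmem_max, and plain C11 atomics stand in for the kernel helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins, not the kernel definitions. */
struct model_sock {
	atomic_int omem_alloc;
};

struct model_filter {
	atomic_uint refcnt;
	int size;
};

static const int model_optmem_max = 20480;	/* assumed limit */

/* Accounting only: charge the filter size to the socket, no refcount change. */
static bool model_filter_charge_nocount(struct model_sock *sk,
					struct model_filter *fp)
{
	if (fp->size <= model_optmem_max &&
	    atomic_load(&sk->omem_alloc) + fp->size < model_optmem_max) {
		atomic_fetch_add(&sk->omem_alloc, fp->size);
		return true;
	}
	return false;
}

/* For a filter that is already referenced: charge, then take one more ref. */
static bool model_filter_charge(struct model_sock *sk, struct model_filter *fp)
{
	bool ret = model_filter_charge_nocount(sk, fp);

	if (ret)
		atomic_fetch_add(&fp->refcnt, 1);
	return ret;
}

/* Attach path: charge first, then set the initial reference exactly once. */
static struct model_filter *model_attach(struct model_sock *sk, int size)
{
	struct model_filter *fp = malloc(sizeof(*fp));

	if (!fp)
		return NULL;

	fp->size = size;
	if (!model_filter_charge_nocount(sk, fp)) {
		free(fp);
		return NULL;
	}
	atomic_init(&fp->refcnt, 1);
	return fp;
}
```

With this shape the attach path never increments from zero, so only one
refcount_set(, 1) would be needed there.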

I will update the patch and send a new version.

Best Regards,
Elena.