Re: [RFC PATCH 01/11] counters: Introduce counter and counter_atomic

From: Kees Cook
Date: Wed Sep 23 2020 - 16:54:43 EST


On Wed, Sep 23, 2020 at 09:34:48PM +0200, Greg KH wrote:
> On Wed, Sep 23, 2020 at 12:04:08PM -0700, Kees Cook wrote:
> > On Tue, Sep 22, 2020 at 07:43:30PM -0600, Shuah Khan wrote:
> > > Introduce Simple atomic and non-atomic counters.
> > >
> > > There are a number of atomic_t usages in the kernel where atomic_t api
> > > is used strictly for counting and not for managing object lifetime. In
> > > some cases, atomic_t might not even be needed.
> >
> > Thank you for working on a counter API! I'm glad to see work here,
> > though I have some pretty significant changes to request; see below...
> >
> > >
> > > The purpose of these counters is twofold: 1. clearly differentiate
> > > atomic_t counters from atomic_t usages that guard object lifetimes,
> > > hence prone to overflow and underflow errors. It allows tools that scan
> > > for underflow and overflow on atomic_t usages to detect overflow and
> > > underflows to scan just the cases that are prone to errors. 2. provides
> > > non-atomic counters for cases where atomic isn't necessary.
> > >
> > > Simple atomic and non-atomic counters api provides interfaces for simple
> > > atomic and non-atomic counters that just count, and don't guard resource
> > > lifetimes. Counters will wrap around to 0 when it overflows and should
> > > not be used to guard resource lifetimes, device usage and open counts
> > > that control state changes, and pm states.
> > >
> > > Using counter_atomic to guard lifetimes could lead to use-after free
> > > when it overflows and undefined behavior when used to manage state
> > > changes and device usage/open states.
> > >
> > > Signed-off-by: Shuah Khan <skhan@xxxxxxxxxxxxxxxxxxx>
> >
> > I would really like these APIs to be _impossible_ to use for object
> > lifetime management. To that end, I would like to have all of the
> > *_return() functions removed. It should be strictly init, inc, dec,
> > read.
> >
> > > +There are a number of atomic_t usages in the kernel where atomic_t api
> > > +is used strictly for counting and not for managing object lifetime. In
> > > +some cases, atomic_t might not even be needed.
> >
> > Why even force the distinction? I think all the counters should be
> > atomic and then there is no chance they will get accidentally used in
> > places where someone *thinks* it's safe to use a non-atomic. So,
> > "_atomic" can be removed from the name and the non-atomic implementation
> > can get removed. Anyone already using non-atomic counters is just using
> > "int" and "long" anyway. Let's please only create APIs that are always
> > safe to use, and provide some benefit over a native type.
>
> For "statistics", why take the extra overhead for an atomic variable
> just to be able to show to a debugging file the number of USB packets
> that have been sent through the system (a current use of an atomic variable
> for some odd reason...)
>
> And really, a "int" should be pretty safe to write from multiple places,
> you aren't going to get "tearing" on any processors that run Linux,
> worst case you get a stale value when reading them.
>
> So I would argue that the default for a counter be just an int, not
> atomic, as odds are, most atomics are not really needed for this type of
> thing at all.

If the atomicity isn't needed, then they can just use an int. ;)

I think the _counter_ type should be robust. We're specifically looking
at replacing the users who are already using atomic_t for counting. The
idea is to separate all the atomic_t doing ref counting into refcount_t
and all the atomic_t doing statistics into "struct counter", and then
what's left can meaningfully be reasoned about. i.e. "why is this a raw
atomic_t?"

But creating "struct counter" with a non-atomic API doesn't make sense
to me. And it certainly doesn't make sense for replacing existing
atomic_t statistics use cases.
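
To make the shape concrete, here is a rough userspace sketch of the
init/inc/dec/read-only surface I have in mind. Everything in it is an
assumption for illustration: the function names are hypothetical, and
C11 <stdatomic.h> stands in for the kernel's atomic_t so it can be
compiled outside the tree. It is not the API from this patch series.

```c
/* Userspace sketch only: C11 atomics stand in for the kernel's atomic_t. */
#include <stdatomic.h>

/* Hypothetical type, named after the "struct counter" wording above. */
struct counter {
	atomic_int cnt;
};

/* Set the starting value before the counter is shared. */
static inline void counter_init(struct counter *c, int val)
{
	atomic_init(&c->cnt, val);
}

/* Increment; the updated value is deliberately not returned. */
static inline void counter_inc(struct counter *c)
{
	atomic_fetch_add_explicit(&c->cnt, 1, memory_order_relaxed);
}

/* Decrement; again there is no *_return variant. */
static inline void counter_dec(struct counter *c)
{
	atomic_fetch_sub_explicit(&c->cnt, 1, memory_order_relaxed);
}

/* Snapshot of the current value, e.g. for a debugfs file. */
static inline int counter_read(struct counter *c)
{
	return atomic_load_explicit(&c->cnt, memory_order_relaxed);
}
```

Since inc and dec return nothing, there is no dec-and-test style
operation a caller could lean on to decide when to free an object,
which is the property I'm after.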

--
Kees Cook