Re: [PATCH 01/13] seqnum_ops: Introduce Sequence Number Ops

From: Greg KH
Date: Thu Nov 12 2020 - 08:28:27 EST


On Wed, Nov 11, 2020 at 09:15:55PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 11, 2020 at 11:28:13AM -0700, Shuah Khan wrote:
> > On 11/11/20 10:50 AM, Peter Zijlstra wrote:
> > > On Wed, Nov 11, 2020 at 10:34:05AM -0700, Shuah Khan wrote:
> > >
> > > > Not sure what to make of the 6080 atomic_read()s and 3413
> > > > atomic_inc()s, some of which might be assuming uniqueness
> > > > guarantee.
> > >
> > > Well, clearly you just did: git grep atomic_{read,inc}() | wc -l and
> > > didn't look at the usage. Equally clearly there can be bugs. Also
> > > evidently much of those are not in fact sequence numbers.
> > >
> >
> > Looking at the usage and classifying which usages are sequence
> > numbers is part of my audit and we are covered. Your explanation
> > and this discussion help me do a better audit of these usages.
>
> Auditing is fine, but I still don't see any point in actually having
> these wrapping types. It's all a waste of space and compile-time IMO.
>
> Neither this sequence counter, nor stat_t or whatever else bring any
> actual differences. They're pure wrappers without change in semantics.
>
> refcount_t is useful because it brought different semantics, it raises
> exceptions on invalid usage (wraps). But this is just pointless NOPs.
>
> So do your audit, but only introduce new types for things that actually
> have different semantics. If you do a patch and the generated code is
> 100% identical but you have many more lines of code, you've only made it
> worse.

I'm sorry, but as someone who reviews the second-most code in the
kernel, I have to disagree. If I see a "raw" atomic_t being used in a
driver, I then have to look up all instances of where that variable is
being used, to verify what they are using it for, why they are using
it, and whether all of that means they are really using it in the
correct way.

Always remember that atomic_t is way down there on the "Rusty scale of
designing an API you can use properly" scale:
https://ozlabs.org/~rusty/ols-2003-keynote/img46.html

If I see a sequence_t variable (or whatever we end up calling it), then
I instantly KNOW what this is for, and that it is impossible to get it
wrong when using it, as the API for that variable prevents it from being
misused in horrible ways (like setting it to a value and decrementing
it.)
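
To make that concrete, here is a rough userspace sketch using C11
atomics. It is not the kernel's atomic_t and not the API from this patch
series; the names seqnum32, SEQNUM32_INIT, seqnum32_inc_return and
seqnum32_read are made up for illustration. The point is only that the
type exposes "increment" and "read" and nothing else, so there is no
helper to set or decrement it:

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical sequence-number type (illustration only). Wrapping the
 * atomic in a struct means it cannot be handed to generic atomic
 * helpers by accident; callers have to go through the functions below.
 */
struct seqnum32 {
	_Atomic uint32_t value;
};

/* Static initializer, e.g.: struct seqnum32 s = SEQNUM32_INIT(0); */
#define SEQNUM32_INIT(n) { .value = (n) }

/* Atomically advance the sequence number and return the new value.
 * The counter is unsigned, so it wraps to 0 after UINT32_MAX. */
static inline uint32_t seqnum32_inc_return(struct seqnum32 *seq)
{
	return atomic_fetch_add(&seq->value, 1) + 1;
}

/* Read the current value without modifying it. */
static inline uint32_t seqnum32_read(struct seqnum32 *seq)
{
	return atomic_load(&seq->value);
}

/* Deliberately absent: no set, no decrement, no add-arbitrary-value. */
```

A reviewer who sees a struct seqnum32 in a driver only has to check the
increment and read call sites; anything else would have to reach into
the struct directly, which stands out in a diff.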

If I, as a kernel developer, want to add a sequence number to my
driver, yes, I can "open code" one using an atomic_t and get it right
(or just use a u64 like we do for uevents), but then when I go back and
look at the code in 5 years, I have to try to remember exactly what I
did and where it is used and try to ensure that no one changed it
incorrectly. Again, if this is a sequence_t, all of that goes away.

So this doesn't save code space or generated code; it saves mental
energy, which is the most limited resource we have. We write code for
the developers first, the compiler and cpu second, in order to create
something that we developers can maintain for long periods of time.
Kernel code is not like perl (write once, modify never), but like laws
(write once, modify constantly).

Remember us poor maintainers, who are doing the reviewing, and the
junior developers, creating new drivers where they have to implement
common features/patterns and the people that come after us and curse our
name as they try to understand exactly what a specific atomic_t was
supposed to be doing. We want to make all of our lives easier, and this
type of api does just that.

thanks,

greg k-h