Re: [PATCH 01/10] x86: assembly, ENTRY for fn, GLOBAL for data
From: Ingo Molnar
Date: Tue Mar 07 2017 - 03:30:49 EST
* hpa@xxxxxxxxx <hpa@xxxxxxxxx> wrote:
> On March 1, 2017 2:27:54 AM PST, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >> > So how about using macro names that actually show the purpose, instead of
> >> > importing all the crappy, historic, essentially randomly chosen debug
> >> > symbol macro names from the binutils and older kernels?
> >> >
> >> > Something sane, like:
> >> >
> >> > SYM__FUNCTION_START
> >>
> >> Sane would be:
> >>
> >> SYM_FUNCTION_START
> >>
> >> The double underscore is just not giving any value.
> >
> > So the double underscore (at least in my view) has two advantages:
> >
> > 1) it helps separate the prefix from the postfix.
> >
> > I.e. it's a 'symbols' namespace, and a 'function start', not the 'start' of a
> > 'symbol function'.
> >
> > 2) It also helps easy greppability.
> >
> > Try this in latest -tip:
> >
> > git grep e820__
> >
> > To see all the E820 API calls - with no false positives!
> >
> > 'git grep e820_' on the other hand is a lot less reliable...
>
> IMO these little "namespace tricks" especially for small common macros like we
> are talking about here make the code very frustrating to read, and even more to
> write. No one would design a programming language that way, and things like PROC
> are really just substitutes for proper language features (and could even be done
> as assembly rather than cpp macros.)
This is a totally different thing from language keywords, which need to be short
and concise for obvious reasons.
Keywords of languages get nested and are used all the time, and everyone needs to
know them and they need to stay out of the way. The symbol start/end macros we are
talking about here are _MUCH_ less common, and they are only ever used in a single
nesting level:

    SYM__FUNC_START(some_kernel_asm_function)
    ...
    SYM__FUNC_END(some_kernel_asm_function)

Most kernel developers writing new assembly code rarely know these constructs by
heart, they just look them up and carbon copy existing practices. And guess what,
the 'looking them up' gets harder if the macro naming scheme is an idiosyncratic
leftover from long ago.
Kernel developers _reading_ assembly code will know the exact purpose of the
macros even less, especially if they are named in an ambiguous, illogical fashion.
Furthermore, your suggestion of:
> PROC..ENDPROC, LOCALPROC..ENDPROC and DATA..ENDDATA. Clear, unambiguous and
> balanced.
is neither clear, nor unambiguous, nor balanced! I mean, these names are the
_exact_ opposite:
- 'PROC' is actually ambiguous in the kernel source code context, as it clashes
with common abbreviations of 'procfs' and 'process'.
It's also an unnecessary abbreviation of a word ('procedure') that is not
actually used a _single time_ in the C ISO/IEC 9899:TC2 standard - in all half
thousand+ pages of it. (!) Why the hell does this have to be used in the
kernel?
- It's visually and semantically imbalanced, because some terms have an 'END'
prefix, but there's no matching 'START' or 'BEGIN' prefix for their
counterparts. This makes it easy to commit various symbol definition
termination errors, like:

    PROC(some_kernel_asm_function)
    ...

Here it's not obvious that it needs an ENDPROC. While if it's written as:

    SYM__FUNC_START(some_kernel_asm_function)
    ...

... it's pretty obvious at first sight that an '_END' is missing!
- What you suggest also has senselessly missing underscores, which makes it
_more_ cluttered and less clear. We clearly don't have addtowaitqueue() and
removefromwaitqueue() function names in the kernel, right? Why should we have
'ENDPROC' and 'ENDDATA' macro names?
- Hierarchical naming schemes generally tend to put the more generic category
name first, not last. So it's:

    mutex_init()
    mutex_lock()
    mutex_unlock()
    mutex_trylock()

It's _NOT_ the other way around:

    init_mutex()
    lock_mutex()
    unlock_mutex()
    trylock_mutex()

The prefix naming scheme is easier to read both visually/typographically
(because it aligns vertically in a natural fashion so it's easier to pattern
match), and also semantically: because when reading it it's easy to skip the
line once your brain reads the generic category of 'mutex'.
But with 'ENDPROC' my brain needlessly has to perform the following steps:
- disambiguate the 'END' and the 'PROC'
- fill in the missing underscore
- and finally when arriving at the generic term 'PROC', discard it as
uninteresting
- Short names have good use in programming languages, because everyone who uses
that language knows what they are and they become a visual substitute for the
language element.
But assembly macros are _NOT_ a new language in this sense, they are actually
more similar to library function names: where brevity is actually
counterproductive and harmful, because short names are ambiguous and pollute the
generic namespace. If you look at C library API function name best practices
you'll see that the best ones are all hierarchically named and categorized,
with the more generic category put first, they are unambiguously balanced even
if that makes the names longer, they are clear and use underscores.
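
To illustrate the 'balanced' point above: with a consistent _START/_END pair, a
missing terminator can be found mechanically. Below is a minimal sketch, not an
existing kernel tool. The macro names are the ones proposed in this thread and
the helper name find_unterminated() is made up for the example:

```python
import re

# Macro names as proposed in this thread (an assumption, not the final kernel names).
START_RE = re.compile(r'^\s*SYM__FUNC_START\((\w+)\)')
END_RE = re.compile(r'^\s*SYM__FUNC_END\((\w+)\)')


def find_unterminated(asm_text):
    """Return symbols opened with SYM__FUNC_START but never closed by SYM__FUNC_END."""
    open_syms = []
    for line in asm_text.splitlines():
        m = START_RE.match(line)
        if m:
            open_syms.append(m.group(1))
            continue
        m = END_RE.match(line)
        if m and m.group(1) in open_syms:
            open_syms.remove(m.group(1))
    return open_syms


if __name__ == "__main__":
    sample = (
        "SYM__FUNC_START(some_kernel_asm_function)\n"
        "    ret\n"
        "SYM__FUNC_END(some_kernel_asm_function)\n"
        "SYM__FUNC_START(another_asm_function)\n"
        "    ret\n"
    )
    # Lists the symbols that are missing their SYM__FUNC_END.
    print(find_unterminated(sample))
```

The same kind of check could in principle be written for PROC/ENDPROC, but the
shared prefix with only the _START/_END suffix differing is what makes the pair
obvious to both a human reader and a trivial script.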
For all these reasons the naming scheme you suggest is one of the worst we could
come up with! I mean, if I had to _intentionally_ engineer something as harmful as
possible to readability and maintainability this would be pretty close to it...
I'm upset, because even a single minute of reflection should have told you all
this. I mean, IMHO it's not even a close argument: your suggested naming scheme is
bleeding from half a dozen mortal wounds...
I can be convinced to drop the double underscores (I seem to be in the minority
regarding them), and I can be convinced that 'FUNC' is shorter and still easy to
understand instead of 'FUNCTION', but other than that please stop the naming
madness!
Thanks,
Ingo