Re: Merging of completely unreviewed drivers

From: Ingo Molnar
Date: Fri Feb 22 2008 - 13:54:43 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> I'm personally of the opinion that a lot of checkpatch "fixes" are
> anything but. That mainly concerns fixing overlong lines (where the
> "fixed" version is usually worse than the original), but it's been
> true for some other warnings too.

that was certainly the case for the earlier checkpatch releases which
treated overlong lines as an error.

So here's a quick list of negative and positive aspects of current
versions of checkpatch, as i see them.

But let me first declare it that when scripts/checkpatch.pl was
initially merged last year i immediately ran it over my own files and
became a deep sceptic of it. (check the lkml archives, i complained alot
about it)

Now i've got more than half a year of experience with using checkpatch
as an integral part of scheduler maintenance, and we've now got 4 months
of experience with using checkpatch in arch/x86 maintenance.

Based on this first hand experience, my opinion about checkpatch has
changed, rather radically: i now believe that checkpatch is almost as
important to the long term health of our kernel development process as
BitKeeper/Git turned out to be. If i had to stop using it today, it
would be almost as bad of a step backwards to me as if we had to migrate
the kernel source code control to CVS.

Lets see the Bad Side of checkpatch:

1) checkpatch "errors" shouldnt be taken too seriously for newly
introduced "leaf" driver code, which code we dont at all know
whether we'll be maintaining in any serious manner in the future.
Slowing down a submission by requirig it to pass checkpatch is not
as clear-cut as it is for core infrastructure and architecture code.
It's far more important to get _any_ code to users (as long as it's
not outright harmful) than to nitpick about style details.

2) it still has some false positives. (They are quite rare in the
latest versions, about 1 out of 100 for code that is already
"clean". I send them over to Andy whenever i see them, and they get
fixed quickly. The false positives were a big annoyance in early
checkpatch.pl versions, these days they are not - to me at least.)

3) it's _really_ annoying when sometimes i stumble over some old,
crufty piece of code that according to checkpatch is in high need of
some good, thorough cleanup - and when i take a look at the code it
turns out that the original author of that crap piece of code turns
out to be ... me. Those moments can be pretty embarrasing and
sobering ;-)

The Good Side of checkpatch (and here i'll only list the non-obvious
advantages):

1) 90% of the scheduler related checkpatch fixes today you'll never
recognize in a commit! The fixes all happen before code is
submitted, and the fixes are seemlessly embedded in nice looking
patches. (in that sense checkpatch is a bit like lockdep: 90% of the
errors they detect wont hit lkml, ever.)

2) you might know that Deja-Vu moment when you look at a new patch that
has been submitted to lkml and you have a strange, weird "feeling"
that there's something wrong about the patch.

It's totally subconscious, and you take a closer look and a few
seconds later you find a real bug in the code.

That "feeling" i believe comes from a fundamental property of how
human vision is connected to the human brain: pattern matching.
Really good programmers have built a "library" of patterns of "good"
and "bad" looking coding practices.

If a patch or if a file has a clean _style_, bugs and deeper
structural problems often stand out like a sore thumb. But if the
code is peppered with random style noise, it's a lot harder (for me
at least) to notice real bugs. I can notice bugs in a squeeky clean
code base about 5 times easier than in a noisy codebase. This effect
alone makes checkpatch indispensible for the scheduler and for
arch/x86.

Sidenote: i dont really need fancy metrics trying to tell me how
good an algorithm _truly_ is (although it certainly would be
interesting to have). I can _see_ that at a glance - provided the
code follows common kernel practices and a common, consistent style.
Checkpatch makes visual code patterns universal and eases the human
maintainance work enormously, for a 150+ KLOC subsystem like
arch/x86. I'm not distracted (visually and mentally) by the thick
fog of small silly details and quirks in coding style. Others might
have radar eyes and radar brains, i dont :-)

3) checkpatch also keeps _my_ bugs out of the kernel in an interesting
way. I'm sure many of you are like me: i've got "weaker" moments
when i write rather crappy code, and i've got "stronger" moments
when i'm in the flow and can write a few thousand lines of code with
nary a hickup. What makes things worse is it's really hard to tell
the two apart.

It turns out - and this surprised me a lot - that when i write new
code that is "weaker", i tend to make more "style mistakes", without
noticing them. Later on, when i do a checkpatch run, i see some
weird looking code and find that it's also buggy!

This concept also works with code written by others: when i get a
careless patch written in a hurry, it is much more likely to have
style errors in it, and as a maintainer i'm warned about that fact.

The best programmers are the ones who have a good eye for details -
and that subconsciously extends to "style details" too. I've yet to
see a _single_ example of a good, experienced kernel programmer who
writes code that looks absolutely careless and sloppy, but which is
top-notch otherwise. (Newbies will make style mistakes a lot more
often - and for them checkpatch is a nice and easy experience at
reading other people's code and trying to learn the style of the
kernel.)

4) there's a psychological effect as well: clean _looking_ code is
more attractive to coders to improve upon. Once the code _looks_
clean (mechanically), the people with the real structural cleanups
are not far away either. Code that just looks nice is simply more of
a pleasure to work with and to improve, so there's a strong
psychological relationship between the "small, seemingly unimportant
details" cleanups and the real, structural cleanups.

On the other hand, bad looking, unaesthetic code is avoided by
kernel developers like the pest. That is a constant skewing force
that is very harmful to Linux, because the "current style" of
subsystems is a pretty random property at the moment, and there are
_many_ important codebases in the kernel that are avoided by most of
us purely just because they look so awful.

5) cleanups were rather hard to get upstream before, because there was
never any true "objective basis" for the cleanups, giving an easy
excuse for flames over stupid taste differences, and making it easy
for maintainers to reject 90%-good cleanups just based on taste
differences. Checkpatch gives the right tool to people to write
consistently clean code and makes it harder for maintainers to find
the arguments to keep keep code unclean.

After this list of rather subjective impressions, i've also got some
historic raw data as well about how arch/x86 cleanups progressed over
the past 4 months.

( NOTE: the "errors" count below does _not_ include "lines longer than
80 chars" warnings nor any of the other checkpatch warnings - only
checkpatch "errors" which are real bona fide style errors in 99%+ of
the cases. )

errors lines of code errors/KLOC
........................................................................
v2.6.24-rc1 arch/x86/ [23 Oct 2007] 8695 117423 74.0
v2.6.24-x86.git arch/x86/ [21 Nov 2007] 5190 117156 44.2
v2.6.24-x86.git arch/x86/ [18 Dec 2007] 4057 117213 34.6
v2.6.24-x86.git arch/x86/ [ 8 Jan 2008] 3650 117987 30.9
v2.6.24-x86.git arch/x86/ [ 4 Feb 2008] 3334 133542 24.9
v2.6.25-x86.git arch/x86/ [21 Feb 2008] 2724 136963 19.8

[ See: http://redhat.com/~mingo/x86.git/code-quality - although i guess
i should rename it to "style-quality" - because there is no direct
mapping of style quality to real code quality. NOTE: some of the
reductions in the error count above are mechanic from things like
really long arrays or the math-emu changes - but most of the real
reductions are genuine. ]

v2.6.24-rc1 was the raw arch/x86 code how we inherited it after we did
the mechanic unification without changing any of the files. After that
point you can see a marked reduction in the total count of style errors.

While many of the fixes are just small details and may all seem
insignificant in isolation, IMO the sum of those small details matters
_a lot_: in the past 4 months the code has become a lot more hackable to
us and that process was driven in large part by checkpatch.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/