Re: Profanity in the Linux Kernel?!?!?

Kai Henningsen (kaih@khms.westfalen.de)
17 Jun 1999 09:36:00 +0200


rhw@MemAlpha.CX (Riley Williams) wrote on 16.06.99 in <Pine.LNX.4.10.9906161737330.15704-100000@ps.cus.umist.ac.uk>:

> >> a misleading or ambiguous error message is worse than no
> >> error message at all in my experience.
>
> > Let me go on record as violently disagreeing, here.
>
> > A bad error message is still something you can track down. Even
> > if it's bad, you know that execution reaches the point where it
> > gets printed, and you can backtrack from there.
>
> If you can backtrack to the actual error, you're extremely lucky in my
> experience.

I do it all the time; so I doubt it's luck.

Of course, it helps if (read: it's essential that) you have source.

The basic principle is simple - you know (or can find out) where the error
message is generated; find out how you got there, and how you got *there*,
and so on, until you find the error.
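
A minimal sketch of that backtracking, in Python rather than kernel C, just to make the idea concrete. Everything in it (report, parse_header, load_file) is a hypothetical name invented for illustration and appears nowhere in this thread; it only shows that even a terse message can be walked back caller by caller once you know where it is emitted.

```python
import traceback


def report(msg):
    """Hypothetical helper: return msg plus the chain of callers that led here."""
    # extract_stack() lists frames oldest first; the last frame is report()
    # itself, so drop it and keep only the callers.
    frames = traceback.extract_stack()[:-1]
    # Innermost caller first: "how you got there, and how you got *there*".
    chain = " <- ".join(frame.name for frame in reversed(frames))
    return f"{msg} [reached via: {chain}]"


def parse_header(data):
    # The message is unhelpful, but its origin is easy to find in the source.
    if not data:
        return report("bad header")
    return "ok"


def load_file(data):
    return parse_header(data)


print(load_file(b""))
```

The message still says nothing about why the header was bad; the caller chain only tells you where to start reading.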

No, it doesn't always work. But when it doesn't, that's not the fault of
the error message, but the fault of the error itself (for example, memory
corruption leading to an error much later, or timing-related problems -
especially all those problems that change behaviour under debugging).

> > In fact, in my experience, about a third of all error reports
> > are misleading anyway, even given the most careful messages, and
> > there's nothing you can do about that.

... because the situation just wasn't foreseen.

> In my experience, error reports can be broken down into the following
> four categories:
>
> 1. Syntax errors, or errors of coding that prevent the program
> from compiling correctly. These are all detected straight away,
> and are the easiest errors to fix.
>
> 2. Linking errors, or errors that permit the program to compile,
> but prevent it from linking into a valid executable. These are
> also detected straight away, and are the next easiest errors
> to fix.

Also, most of these never come in the form of an error report - programmer
breaks it, programmer sees it, programmer fixes it, before any third party
ever gets involved.

> 3. Trapped logical errors, or errors for which the program checks
> at the point where the error may occur. Because they are trapped,
> these errors are relatively easy to fix as one knows exactly
> where to look for the error based on the error message. As a
> result, most programs over 6 months old have few of these left.

Also known as "this can't happen" errors.

Your middle sentence is quite contrary to my experience. Those errors tell
you that something went wrong; they don't tell you why it went wrong - the
programmer didn't *expect* it to go wrong that way, so he must now find
the error in his own logic. Sometimes, it's obvious. Sometimes, it's
extremely hard to find.

Especially if it's timing or input data related.
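
As a hedged illustration of such a "this can't happen" check (Python again, with a made-up remove_job function and job queue that are assumptions of this sketch, not anything from the kernel):

```python
def remove_job(queue, job_id):
    """Remove job_id from queue; callers are assumed to have queued it earlier."""
    if job_id not in queue:
        # "This can't happen" check: it reports that the invariant broke here,
        # but not which earlier code path lost the job or never added it.
        raise RuntimeError(f"remove_job: job {job_id} not queued (can't happen)")
    queue.remove(job_id)
    return queue
```

The check pins down where the assumption failed. The cause lives in whichever caller broke the assumption, and that is the part you still have to hunt for.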

> 4. Untrapped logical errors, or errors for which no check is made
> by the program at the point where the error may occur. Because
> they are NOT trapped, these errors are extremely hard to locate.

Not necessarily. I've had it happen often enough that something just
breaks, and I go "oh. This must come from that point ... [look at source]
... yup, there's a bug."

I think it is a fundamental mistake to assume that the difficulty of
solving an error is (other than trivially) related to the type of error.

> Difficult to understand error messages are inevitably also old ones,
> usually ones that made sense when they were first written, but for
> which their original meaning no longer applies, and their appearance
> generally indicates an error in category (4) for which the only thing
> one can safely say is that the actual error will be nowhere near the
> error message.

That doesn't match my experience at all. In fact, IME, old error messages
that no longer apply are typically fairly trivial to fix.

> On the other hand, an error message that's easy to understand usually
> indicates an error in category (3) and thus one that's easily tracked
> down and fixed.

Also wrong IME.

I'd write more, but time is running out. Gotta go.

MfG Kai
