Re: Style question: comparison between signed and unsigned?

Bodo Moeller (Bodo_Moeller@public.uni-hamburg.de)
Tue, 23 Sep 97 14:29 GMT+0200


Linus Torvalds <torvalds@transmeta.com>:

> int i = read(socket, buffer, sizeof(buffer));
> if (i == -1) return i;
> if (i < sizeof(struct pkthdr))
> return SHORT_PACKET;

> The PROBLEM is that if gcc warns about signed/unsigned comparisons
> (which some versions of gcc do), gcc will totally needlessly [...]
> warn about the second test [...] because "i" is signed, but "sizeof"
> is unsigned.
> To get rid of that _spurious_ warning, I'd have to change the test
> into something like:

> if (i < (int) sizeof(struct pkthdr))
> return SHORT_PACKET;

> Quite frankly, anybody who claims that the extra cast is a good thing
> should be shot on the spot

That's the wrong cast [1]. Unless you want to do without a cast, that
piece of code should look like this:

if ((size_t) i < sizeof(struct pkthdr))
return SHORT_PACKET;

This is also what happens if there is no explicit cast: According to
the C balancing rules, when you write i < sizeof(struct pkthr), the
int is automatically converted to the unsigned type size_t. Writing
(size_t) i documents to both human readers and stupid compilers that
you are aware of this conversion, and that you have ensured that it
does not cause any problems (because i cannot be negative when that
statement is reached).

[1] Although, of course, here it wouldn't cause any real problems.
If sizeof(struct pkthdr) was too large for a signed int, we'd
have other things to worry about :-)

> Type-casts are the _single_ most dangerous thing you can do in C,
> and a compiler that _encourages_ you to add spurious type casting
> for no gain is a bad compiler.

It is true that type-casts can be dangerous, but so can the lack
thereof [2] and the implicit type conversion rule that C obeys when
there is no type cast. In your above example, the type cast in
(size_t) i < sizeof(struct pkthdr)
makes explicit that this comparison uses i in a possibly
non-intuitive way: Otherwise, a careless programmer might decide to
"optimize" your code to

int i = read(socket, buffer, sizeof(buffer));
if (i < sizeof(struct pkthdr)) /* read() failed or short packet */
return SOME_ERROR_HAPPENED;

which of course doesn't do.

[2] A classical example is
int a = some_quite_large_number;
int b = another_large_number;
long c = a + b;
-- here we _need_ a cast, we should write
long c = (long) a + (long) b;
(alternatively, one might prefer to leave out one of these
type casts). This does not really have much to do with the
discussion in this thread, but it shows that even the most
innocuous code sometimes needs a type cast.

On the other hand, we should consider the _dangers_ of

(size_t) i < sizeof(struct pkthdr).

What we have to fear here is that this expression will be rendered
incorrect if the declaration of i is changed to, e.g., "long i".
Similar to your "i < (int) sizeof(...)" proposal, this incorrectness
is not likely to cause real problems -- after all, i is the return
value of a read() and thus fits into an int --, but to the very least
it's bad style. This side-effect might go unnoticed by whoever
changes i's declaration. (Some people might like to have something
on the lines of "assert(sizeof(i) <= sizeof size_t);" before the
comparison, which could help to avoid that problem.)

In order to weigh
i < sizeof(struct pkthdr)
against
(size_t) i < sizeof(struct pkthdr)
and to evaluate the risks of both (that is, the risk that later
changes of the code will lead to problems), we'll have to ask how
likely it is in each of these cases that someone will change the code
without respecting the respective pitfalls.
(I won't discuss weighing the effort spent on these analyses against
the effort going in actual programming :-)

The first failure scenario
int i = read(socket, buffer, sizeof(buffer));
if (i < sizeof(struct pkthdr)) /* read() failed or short packet */
return SOME_ERROR_HAPPENED;
is plausible, if we assume that many C programmers are _not_ instantly
aware of the conversion of both values to unsigned when they see a
comparison of that form. (You could argue that you don't want such
programmers to tamper with other than toy programs, anyway.)

The second failure scenario
long i = .... whatever ....;
....;
if ((size_t) i < sizeof(struct pkthdr))
return SHORT_PACKET;
is IMHO less plausible, because this could only happen if the other
code has evolved a lot (whilst the comparison is not touched): E.g.,
the read() likely has disappeared at some stage -- otherwise i
probably wouldn't be a _long_ int.

My conclusion is that in this particular example, the type cast
(size_t) i seems to have merit. In general, I agree that many type
casts should be cast aside (and those who abuse type casts for some
"a pointer is an int is a long" programming style should be cast down)
-- every type cast might hide a cast-off (but justified) compiler
warning. But sometimes even those type casts that are not required by
the C programming language can be advantageous.

Bodo M"oller
<Bodo_Moeller@public.uni-hamburg.de>