Re: [PATCH] kbuild: treat char as always signed

From: Gabriel Paubert
Date: Thu Oct 20 2022 - 08:35:27 EST


On Wed, Oct 19, 2022 at 11:11:16AM -0700, Linus Torvalds wrote:
> On Wed, Oct 19, 2022 at 10:45 AM Segher Boessenkool
> <segher@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > When I did this more than a decade ago there indeed was a LOT of noise,
> > mostly caused by dubious code.
>
> It really happens with explicitly *not* dubious code.

Indeed.

[snip]
> The "-Wpointer-sign" thing could probably be fairly easily improved,
> by just recognizing that things like 'strlen()' and friends do not
> care about the sign of 'char', and neither does a 'strcmp()' that only
> checks for equality (but if you check the *sign* of strcmp, it does
> matter).

I must be missing something, because the strcmp man page says:

"The comparison is done using unsigned characters."
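
To make that concrete, here is a minimal sketch of why the *sign* of the
result depends on how the bytes are read. Both helper names are made up
for illustration; only strcmp() itself is real. The C standard specifies
that strcmp() orders strings as if the bytes were unsigned char, which is
what the second helper mimics:

```c
#include <string.h>

/*
 * Hypothetical helper: compares through plain 'char', so the ordering of
 * bytes >= 0x80 depends on whether 'char' is signed on the target ABI.
 */
static int cmp_plain_char(const char *a, const char *b)
{
	while (*a && *a == *b) {
		a++;
		b++;
	}
	return (*a > *b) - (*a < *b);
}

/*
 * Hypothetical helper: same loop, but the bytes are read as
 * 'unsigned char', which is how strcmp() is specified to order them.
 */
static int cmp_unsigned_char(const char *a, const char *b)
{
	const unsigned char *ua = (const unsigned char *)a;
	const unsigned char *ub = (const unsigned char *)b;

	while (*ua && *ua == *ub) {
		ua++;
		ub++;
	}
	return (*ua > *ub) - (*ua < *ub);
}
```

With a = "\x80" and b = "\x7f", cmp_unsigned_char() and strcmp() both
report a > b on every ABI, while cmp_plain_char() reports a < b wherever
plain char is signed and a > b wherever it is unsigned.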

But it's not for this that I wrote this message. Has anybody considered
using transparent unions?

They've been heavily used by userland networking code to pass pointers to
socket addresses, and they work reasonably well in that context IMHO.

So a very wild idea might be to make string handling functions accept a
transparent union of "char *" and "unsigned char *".

I've not even tried to write any code in this direction, so it's very
likely that this idea won't fly, and it clearly does not solve all
problems. It would also probably need a lot of surgery to avoid clashing
with GCC builtins, and would unfortunately lose some optimizations.
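
For readers unfamiliar with the feature, a minimal userland sketch of the
shape such a wrapper could take. The names str_arg_t and any_strlen() are
hypothetical, not an existing kernel or libc API, and this relies on the
transparent_union attribute, a GCC/Clang extension that is only available
when compiling C (not C++). It says nothing about how this would interact
with the builtins:

```c
#include <stddef.h>

/*
 * Hypothetical argument type: a caller may pass either pointer flavour
 * without a cast, and without triggering -Wpointer-sign.
 */
typedef union {
	const char *s;
	const unsigned char *u;
} __attribute__((__transparent_union__)) str_arg_t;

/*
 * Hypothetical sign-agnostic strlen(). The length does not depend on
 * the signedness of the bytes, so reading through either member works.
 */
static size_t any_strlen(str_arg_t arg)
{
	const char *p = arg.s;
	size_t n = 0;

	while (p[n])
		n++;
	return n;
}
```

A call such as any_strlen("hello") and a call with an "unsigned char *"
argument both compile cleanly, since the argument only has to match one
of the union members. A "signed char *" argument would still be rejected
unless a third member were added.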

Gabriel

>
> It's been some time since I last tried it, but at least from memory,
> it really was mostly the standard C string functions that caused
> almost all problems. Your *own* functions you can just make sure the
> signedness is right, but it's really really annoying when you try to
> be careful about the byte signs, and the compiler starts complaining
> just because you want to use the bog-standard 'strlen()' function.
>
> And no, something like 'ustrlen()' with a hidden cast is just noise
> for a warning that really shouldn't exist.
>
> So some way to say 'this function really doesn't care about the sign
> of this pointer' (and having the compiler know that for the string
> functions it already knows about anyway) would probably make almost
> all problems with -Wsign-warning go away.
>
> Put another way: 'char *' is so fundamental and inherent in C, that
> you can't just warn when people use it in contexts where sign really
> doesn't matter.
>
> Linus