Re: [PATCH 00/13] [RFC] Rust support
From: Willy Tarreau
Date: Sat Apr 17 2021 - 07:47:08 EST
On Sat, Apr 17, 2021 at 01:17:21PM +0200, Peter Zijlstra wrote:
> Well, I think the rules actually make sense, at the point in the syntax
> tree where + happens, we have 'unsigned char' and 'int', so at that
> point we promote to 'int'. Subsequently 'int' gets shifted and bad
> things happen.
That's always the problem caused by signedness being applied to the
type while modern machines do not care about that and use it during
(or even after) the operation instead :-/

We'd need to define some macros to zero-extend and sign-extend some
values to avoid such issues. I'm sure this would be more intuitive
than trying to guess how many casts (and in what order) to place to
make sure an operation works as desired.
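
To illustrate the idea, here is a minimal sketch of what such helpers could look like. The names zext8_64(), sext8_64() and byte_at_24() are hypothetical, made up for this example, and are not existing kernel macros. The point is only that the caller states the extension explicitly, so a byte never goes through 'int' before being shifted (a plain 'b << 24' is evaluated as 'int' and overflows it for b >= 0x80).

```c
#include <stdint.h>

/* Hypothetical helpers (names invented for illustration, not kernel macros). */

/* Zero-extend an 8-bit value to 64 bits: the upper 56 bits are always 0. */
static inline uint64_t zext8_64(uint8_t x)
{
	return (uint64_t)x;
}

/*
 * Sign-extend an 8-bit value to 64 bits: bit 7 is copied into the upper bits.
 * The xor/subtract form avoids the implementation-defined
 * unsigned-to-signed narrowing conversion.
 */
static inline int64_t sext8_64(uint8_t x)
{
	return (int64_t)(x ^ 0x80) - 0x80;
}

/* Place a byte in bits 24..31 of a 64-bit word without going through int. */
static inline uint64_t byte_at_24(uint8_t b)
{
	return zext8_64(b) << 24;
}
```

With these, byte_at_24(0x80) is 0x80000000 regardless of the width of 'int' or 'long', and no cast ordering has to be guessed at the call site.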
> The 'unsigned long' doesn't happen until quite a bit later.
>
> Anyway, the rules are imo fairly clear and logical, but yes they can be
> annoying. The really silly thing here is that << and >> have UB at all,
> and I would love a -fwrapv style flag that simply defines it. Yes it
> will generate worse code in some cases, but having the UB there is just
> stupid.
I'd also love to have a UB-less mode with well-defined semantics for
plenty of operations that are known to work well on modern machines,
like integer wrapping, bit shifts ignoring higher bits etc. Lots of
stuff we often have to write useless code for, just to please the
compiler.
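
As an example of the kind of code one ends up writing today, here is a rough sketch of two helpers that spell out those semantics by hand. The names wrap_add32() and shl32() are hypothetical. The final conversion back to int32_t in wrap_add32() is implementation-defined in ISO C; this sketch assumes the modulo behaviour that GCC documents.

```c
#include <stdint.h>

/*
 * Hypothetical helper: signed 32-bit addition that wraps instead of being UB.
 * The sum is computed on unsigned operands, where wrapping is well defined.
 * Assumption: converting the result back to int32_t wraps modulo 2^32
 * (implementation-defined in ISO C, documented as such by GCC).
 */
static inline int32_t wrap_add32(int32_t a, int32_t b)
{
	return (int32_t)((uint32_t)a + (uint32_t)b);
}

/*
 * Hypothetical helper: left shift that ignores the higher bits of the
 * shift count, so a count of 32 or more is never passed to <<.
 */
static inline uint32_t shl32(uint32_t x, unsigned int n)
{
	return x << (n & 31);
}
```

Under that assumption, wrap_add32(INT32_MAX, 1) yields INT32_MIN, and shl32(1, 33) yields 2 since only the low 5 bits of the count are used.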
> That of course doesn't help your case here, it would simply misbehave
> and not be UB.
>
> Another thing the C rules cannot really express is a 32x32->64
> multiplication, some (older) versions of GCC can be tricked into it, but
> mostly it just doesn't want to do that sanely and the C rules are
> absolutely no help there.
For me the old trick of casting one side as long long still works:

  unsigned long long mul3264(unsigned int a, unsigned int b)
  {
        return (unsigned long long)a * b;
  }

i386:
  00000000 <mul3264>:
     0: 8b 44 24 08             mov    0x8(%esp),%eax
     4: f7 64 24 04             mull   0x4(%esp)
     8: c3                      ret

x86_64:
  0000000000000000 <mul3264>:
     0: 89 f8                   mov    %edi,%eax
     2: 89 f7                   mov    %esi,%edi
     4: 48 0f af c7             imul   %rdi,%rax
     8: c3                      retq
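
For comparison, a small sketch of why the cast has to sit on an operand rather than on the result. The names mul_wide() and mul_narrow() are hypothetical; mul_wide() is the same trick as mul3264() above, written with fixed-width types.

```c
#include <stdint.h>

/* Widen one operand first: the multiplication is carried out in 64 bits. */
static inline uint64_t mul_wide(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}

/*
 * No cast on the operands: the product is computed in 32 bits, wraps
 * modulo 2^32, and only then gets converted to 64 bits.
 */
static inline uint64_t mul_narrow(uint32_t a, uint32_t b)
{
	return a * b;
}
```

For a = b = 0xffffffff, mul_wide() returns 0xfffffffe00000001 while mul_narrow() returns 1, the upper half being lost before the widening.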
Or maybe you had something else in mind?
Willy