RE: [PATCH v3 3/6] x86/gsseg: make asm_load_gs_index() take an u16
From: H. Peter Anvin
Date: Mon Oct 17 2022 - 14:40:31 EST
On October 17, 2022 12:49:41 AM PDT, David Laight <David.Laight@xxxxxxxxxx> wrote:
>From: H. Peter Anvin
>> Sent: 15 October 2022 03:41
>>
>> On October 14, 2022 5:28:25 AM PDT, David Laight <David.Laight@xxxxxxxxxx> wrote:
>> >From: Xin Li
>> >> Sent: 13 October 2022 21:02
>> >>
>> >> From: "H. Peter Anvin (Intel)" <hpa@xxxxxxxxx>
>> >>
>> >> Let gcc know that only the low 16 bits of load_gs_index() argument
>> >> actually matter. It might allow it to create slightly better
>> >> code. However, do not propagate this into the prototypes of functions
>> >> that end up being paravirtualized, to avoid unnecessary changes.
>> >
>> >Using u16 will almost always make the code worse.
>> >At some point the value has to be masked and/or extended
>> >to ensure an out of range value doesn't appear in
>> >a register.
>> >
>> > David
>>
>> Is that a general statement or are you actually invoking it in this case?
>> This is about it being a narrowing input, *removing* such constraints.
>
>It is a general statement.
>You suggested you might get better code.
>In fact you'll probably get worse code.
>It might not matter here, but ...
>
>Most modern calling conventions use CPU registers to pass arguments
>and results.
>So the compiler is required to ensure that u16 values are in range
>in either the caller or called code (or both).
>Just because the domain of a value is small doesn't mean that
>the best type isn't 'int' or 'unsigned int'.
>
>Additionally (except on x86) any arithmetic on sub-32bit values
>requires additional instructions to mask the result.
>
>Even on x86-64 if you index an array with an 'int' the compiler
>has to generate code to sign extend the value to 64 bits.
>You get better code for 'signed long' or unsigned types.
>This is probably true for all 64bit architectures.
>
>Since (most) CPUs have both sign extending and zero extending
>loads from memory, it can make sense to use u8 and u16 to
>reduce the size of structures.
>But for function arguments and function locals it almost
>always makes the code worse.
>
> David
>
Ok. You are plain incorrect in this case for two reasons:
1. The x86-64 calling convention leaves it to the receiver (callee for arguments, caller for return values) to do any such masking of values.
2. The consumer of the values here does not need any masking or extensions.
So this is simply telling the compiler what the programmer knows.
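
To make the second point concrete, here is a rough userspace sketch, not the kernel code from the patch. All three function names are made up for illustration, and consume_selector() is only an assumed stand-in for a consumer that, like a segment register load, ignores everything above bit 15.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the real consumer: like a segment register
 * load, it only ever looks at the low 16 bits of its input.
 */
unsigned int consume_selector(uint16_t sel)
{
	return sel;
}

/* Wide prototype: nothing in the signature says the upper bits are ignored. */
unsigned int load_index_wide(unsigned int sel)
{
	return consume_selector((uint16_t)sel);
}

/* Narrow prototype: states up front that only the low 16 bits matter. */
unsigned int load_index_u16(uint16_t sel)
{
	return consume_selector(sel);
}
```

Both variants yield the same selector for the same low 16 bits; the narrow prototype just records in the type what the wide one leaves implicit. Whether that changes the generated code depends on the compiler and the call sites.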