Re: [net PATCH] octeontx2-af: Fix marking couple of structure as __packed

From: Jacob Keller
Date: Tue Dec 19 2023 - 16:06:44 EST




On 12/19/2023 7:26 AM, David Laight wrote:
> From: Jacob Keller
>> Sent: 18 December 2023 20:44
>>
>> On 12/18/2023 12:27 AM, Suman Ghosh wrote:
>>> Couple of structures was not marked as __packed which may have some
>>> performance implication. This patch fixes the same and mark them as
>>> __packed.
>>
>> Not sure I follow why lack of __packed would have performance
>> implications? I get that __packed is important to ensure layout is
>> correct or to ensure the whole structure has the right size rather than
>> unexpected gaps. I'd guess maybe because the structure's size would
>> include padding without __packed, leading to a lot of gaps when
>> combining several structures together...
>>
>> I did test on my system with pahole, and even without __packed, I don't
>> get any gaps in the npc_lt_def_cfg structure:
>>
>>
>>> struct npc_lt_def_cfg {
>>>     struct npc_lt_def       rx_ol2;        /*  0  3 */
>>>     struct npc_lt_def       rx_oip4;       /*  3  3 */
>>>     struct npc_lt_def       rx_iip4;       /*  6  3 */
>>>     struct npc_lt_def       rx_oip6;       /*  9  3 */
>>>     struct npc_lt_def       rx_iip6;       /* 12  3 */
>>>     struct npc_lt_def       rx_otcp;       /* 15  3 */
>>>     struct npc_lt_def       rx_itcp;       /* 18  3 */
>>>     struct npc_lt_def       rx_oudp;       /* 21  3 */
>>>     struct npc_lt_def       rx_iudp;       /* 24  3 */
>>>     struct npc_lt_def       rx_osctp;      /* 27  3 */
>>>     struct npc_lt_def       rx_isctp;      /* 30  3 */
>>>     struct npc_lt_def_ipsec rx_ipsec[2];   /* 33 10 */
>>>     struct npc_lt_def       pck_ol2;       /* 43  3 */
>>>     struct npc_lt_def       pck_oip4;      /* 46  3 */
>>>     struct npc_lt_def       pck_oip6;      /* 49  3 */
>>>     struct npc_lt_def       pck_iip4;      /* 52  3 */
>>>     struct npc_lt_def_apad  rx_apad0;      /* 55  4 */
>>>     struct npc_lt_def_apad  rx_apad1;      /* 59  4 */
>>>     struct npc_lt_def_color ovlan;         /* 63  5 */
>>>     /* --- cacheline 1 boundary (64 bytes) was 4 bytes ago --- */
>>>     struct npc_lt_def_color ivlan;         /* 68  5 */
>>>     struct npc_lt_def_color rx_gen0_color; /* 73  5 */
>>>     struct npc_lt_def_color rx_gen1_color; /* 78  5 */
>>>     struct npc_lt_def_et    rx_et[2];      /* 83 10 */
>>>
>>>     /* size: 93, cachelines: 2, members: 23 */
>>>     /* last cacheline: 29 bytes */
>>> };
>>
>>
>> However that may not be true across all compilers etc. Also all the
>> other structures are __packed. Makes sense.
>
> Or not - maybe all the __packed should be removed instead!
>
> Unless these structures (or any others) appear in 'messages' which
> get transferred between systems they really shouldn't be __packed.
> And a 93 byte 'message' with all those fields seems rather odd.
>
> The above breakdown seems to imply everything is 'unsigned char'
> so the __packed makes no difference.
>
> Using __packed requires the compiler generate byte loads/store
> with shifts (etc) on many architectures and should really be avoided
> unless it is absolutely needed for binary compatibility.
>
> Even then if the problem is a 64bit field that only needs to be
> 32bit aligned (as is common for some compat32 code) then the 64bit
> fields should be marked as being 32bit aligned.
>
> David
>
Right. Typically packed is only required when dealing with something
where the exact binary layout matters (i.e. copying to/from hardware or
across systems in such a way that the layout might change with different
compilers/arch).
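
As a rough sketch of what "exact binary layout" means here: the two
structs below are hypothetical (they are not from the octeontx2-af
driver), and the attribute spelling assumes GCC or Clang, which is what
the kernel's __packed macro expands to on those compilers. With a
one-byte field followed by a 32-bit field, the compiler normally inserts
padding unless the struct is packed:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example types, for illustration only. */
struct demo_natural {
	uint8_t  tag;
	uint32_t value;	/* the compiler normally pads before this field */
};

struct demo_packed {
	uint8_t  tag;
	uint32_t value;	/* no padding: follows tag directly at offset 1 */
} __attribute__((packed));

/* Offset of 'value' with the ABI's natural alignment rules. */
size_t demo_natural_value_offset(void)
{
	return offsetof(struct demo_natural, value);
}

/* Offset of 'value' when the struct is packed. */
size_t demo_packed_value_offset(void)
{
	return offsetof(struct demo_packed, value);
}
```

The packed variant is 5 bytes with 'value' at offset 1. The unpacked
offset and size depend on the ABI, which is exactly why a layout shared
with hardware or another system cannot rely on the unpacked form.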

If that isn't how this structure is used, then yeah, removing __packed
seems reasonable. And on at least one system I see no difference in the
actual generated layout, making __packed redundant there.
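
A minimal sketch of why the layout comes out the same: the types below
are hypothetical stand-ins (three one-byte fields, chosen only to match
the 3-byte size pahole reports for npc_lt_def), and the attribute
spelling assumes GCC or Clang. When every member has an alignment of 1,
there is nothing for the compiler to pad on common ABIs such as x86-64
or arm64:

```c
#include <stddef.h>

/* Hypothetical stand-in: three byte-sized fields, no __packed. */
struct demo_bytes {
	unsigned char a;
	unsigned char b;
	unsigned char c;
};

/* Same fields, explicitly packed. */
struct demo_bytes_packed {
	unsigned char a;
	unsigned char b;
	unsigned char c;
} __attribute__((packed));

size_t demo_bytes_size(void)
{
	return sizeof(struct demo_bytes);
}

size_t demo_bytes_packed_size(void)
{
	return sizeof(struct demo_bytes_packed);
}

/* Returns 1 when packing made no difference to the size on this ABI. */
int demo_packed_is_redundant(void)
{
	return demo_bytes_size() == demo_bytes_packed_size();
}
```

The packed size is always 3. Whether the unpacked one matches is up to
the ABI, so a check like this only tells you about the compiler and
architecture it was built with.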

However, it's not clear to me at a glance how this structure is used and
whether it really is copied between places where binary compatibility is
a requirement.
