Re: [PATCH] Enable '-Werror' by default for all kernel builds

From: Harry Wentland
Date: Tue Sep 07 2021 - 23:52:24 EST




On 2021-09-07 5:07 p.m., Harry Wentland wrote:
>
>
> On 2021-09-07 1:33 p.m., Linus Torvalds wrote:
>> On Tue, Sep 7, 2021 at 10:10 AM Linus Torvalds
>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> Do I know why? No. I do note that that code is disgusting.
>>>
>>> It's passing one of those structs around by value, for example. That's
>>> a 72-byte structure that is copied on the stack due to stupid calling
>>> conventions. Maybe clang generates a few extra temporaries for it as
>>> part of the function call stack setup? Who knows..
>>
>> Ooh, yes.
>>
>> This attached patch is crap - it converts the helper functions to use
>> const pointers instead of passing the whole structure, but it then
>> only converts that one file that *uses* them.
>>
>> So the end result will not compile in general, but you can do
>>
>> make drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_rq_dlg_calc_30.o
>>
>> and it compiles for me.
>>
>> And while gcc doesn't care that much - it will apparently either
>> generate the argument stack every call - clang cares deeply.
>>
>> The nasty 720-byte stack frame that clang generates turns into just a
>> 320-byte one, and code generation in general looks a *lot* better.
>>
>> Now, as mentioned, this patch is broken and incomplete. But I really
>> think the AMD GPU people need to do this. It makes those functions go
>> from practically unusable to not horribly disgusting.
>>
>> So Harry/Leo/Alex/Christian and amd-gfx list - can you look into
>> making this ugly "make one file compile better" patch actually work
>> properly?
>>
>
> Yes, will take a look at this tonight. We definitely shouldn't be passing
> large structs by value.
>

Attached patches fix these x86_64 ones reported by Nick:

x86_64-alpine.log:drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:452:13: error: stack frame size (1800) exceeds limit (1280) in function 'dcn_bw_calc_rq_dlg_ttu' [-Werror,-Wframe-larger-than]
x86_64-alpine.log:drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_rq_dlg_calc_21.c:1657:6: error: stack frame size (1336) exceeds limit (1280) in function 'dml21_rq_dlg_get_dlg_reg' [-Werror,-Wframe-larger-than]
x86_64-alpine.log:drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_rq_dlg_calc_30.c:1831:6: error: stack frame size (1352) exceeds limit (1280) in function 'dml30_rq_dlg_get_dlg_reg' [-Werror,-Wframe-larger-than]

I'm also seeing one more that might be more challenging to fix but is nearly at 1024:

drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_mode_vba_21.c:3397:6: error: stack frame size of 1064 bytes in function 'dml21_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than=]

The attached patches build and boot without error or warning on a Radeon RX 5500 XT.

Harry

> Harry
>
>> It *looks* lto me ike that code was perhaps written for a C++ compiler
>> and the helpers have been written as a "pass by reference", and the
>> arguments used to be
>>
>> const display_data_rq_misc_params_st& rq_misc_param
>>
>> and then the compiler will pass the argument as a pointer. And then it
>> was converted to C, and the "pass by reference" in the function
>> declaration was turned into "pass by value", to avoid changing "." to
>> "->" in the use.
>>
>> But a '&arg' thing in C++ really is a '*arg' pointer in C, and should
>> have been done as that.
>>
>> Of course, it's also possible that that code was simply written by
>> somebody who didn't understand just *how* horrible it is to pass
>> structures bigger than a word or two by value.
>>
>> Do we have a compiler warning for passing big structures by value?
>>
>> Linus
>>
>