Re: drivers/gpu/drm/panel/panel-samsung-ld9040.c:240:12: warning: stack frame size of 8312 bytes in function 'ld9040_prepare'

From: Nick Desaulniers
Date: Mon Jun 29 2020 - 17:43:20 EST


On Sat, Jun 27, 2020 at 12:43 PM Vladimir Oltean <olteanv@xxxxxxxxx> wrote:
>
> Hi Nick,
>
> On Mon, 22 Jun 2020 at 19:50, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
> >
>
> > > I really don't get what's the problem here. The listing of
> > > ld9040_prepare at the given commit and with the given .config is:
> >
> > I wrote a tool to help debug these.
> > https://github.com/ClangBuiltLinux/frame-larger-than
> > If you fetch the randconfig and rebuild with debug info, that tool
> > will help show you which variables are used in which stack frames and
> > what their sizes are. Also note that strange things get dug up from
> > randconfigs.
> >
> >
> > --
> > Thanks,
> > ~Nick Desaulniers
>
> I ran your tool and it basically told me that all 11 calls to

Cool? No bugs running it? (I still need to extend support for many
architectures)

> ld9040_dcs_write from within ld9040_init are inlined by the compiler.
> Each of these ld9040_dcs_write functions calls ld9040_spi_write_word
> twice, so 22 inline calls to that. Now, sizeof(struct
> spi_transfer)=136 and sizeof(struct spi_message)=104, so, no wonder we
> run out of stack pretty quickly.

I'd expect these to have distinct lifetimes resulting in stack slot
reuse. When the compiler inlines a function, it introduces a lexical
scope. You can imagine it inlining the body, but within a new
`{}`-delimited compound statement. The compiler then knows that the
variables local to those scopes have non-overlapping lifetimes, and
can reuse their stack slots in the containing function. Escape
analysis comes into play, too, but I'm not sure that's an issue here.
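
To make the scoping argument concrete, here's a rough userspace
sketch of what a caller conceptually looks like after two calls have
been inlined. It is not the driver code: `struct big_xfer`,
`send_word()` and `write_two_words()` are made-up names, and 136 is
just the sizeof(struct spi_transfer) quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a large on-stack object. */
struct big_xfer {
	unsigned char buf[136];
};

/* Hypothetical consumer; taking the address is where escape
 * analysis would come into play. */
static int send_word(const struct big_xfer *x)
{
	int sum = 0;

	assert(x != NULL);
	for (size_t i = 0; i < sizeof(x->buf); i++)
		sum += x->buf[i];
	return sum;
}

/* Each "inlined" body lives in its own compound statement. */
int write_two_words(unsigned char a, unsigned char b)
{
	int total = 0;

	{	/* first inlined call */
		struct big_xfer x;

		memset(&x, a, sizeof(x));
		total += send_word(&x);
	}
	{	/* second inlined call: the first x is dead by now,
		 * so the compiler is free to reuse its stack slot */
		struct big_xfer x;

		memset(&x, b, sizeof(x));
		total += send_word(&x);
	}
	return total;
}
```

Whether the two `x` objects actually end up sharing a slot is up to
the compiler and the config; the point is only that the scopes make
the reuse legal.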

>
> But my question is: what's wrong with the code, if anything at all?

The general case we try to find+fix with this warning is excessively
large stack allocations that probably should be heap allocated,
percpu, or static. Also, the `noinline_for_stack` function annotation
is used frequently for this.

One known source of trouble is the sanitizers, which can generally
prevent the reuse of stack slots. Were any of those enabled in this
config, since this was a randconfig? I'd check this first, then
consider whether `noinline_for_stack` is appropriate on any of the
related functions.
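
As a sketch of what the annotation looks like in practice (userspace,
so the macro is defined locally; in the kernel it comes from
<linux/compiler_types.h>). `struct fake_msg`, `write_word()` and
`init_sequence()` are hypothetical names, not the ld9040 functions.

```c
#include <assert.h>
#include <string.h>

/* Defined here only so the sketch builds outside the kernel. */
#ifndef noinline_for_stack
#define noinline_for_stack __attribute__((__noinline__))
#endif

/* Hypothetical large on-stack object. */
struct fake_msg {
	unsigned char payload[240];
};

/* Kept out of line, so its locals live in its own frame rather
 * than being merged into the caller's frame. */
static noinline_for_stack int write_word(unsigned int word)
{
	struct fake_msg m;

	assert(word <= 0xff);
	memset(&m, 0, sizeof(m));
	m.payload[0] = (unsigned char)word;
	return m.payload[0];
}

/* The caller's frame no longer needs room for a fake_msg per call. */
int init_sequence(void)
{
	int acc = 0;

	acc += write_word(0x01);
	acc += write_word(0x02);
	acc += write_word(0x03);
	return acc;
}
```

The tradeoff is a real call per `write_word()` in exchange for a
small, predictable frame in `init_sequence()`.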

> Why does the compiler try to inline it, and then complain that it's
> using too much stack

The flag -Wframe-larger-than= is a diagnostic on the resulting frame
size, not really an optimization flag that caps the stack usage of
the function being inlined into.
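
A minimal sketch of the kind of function the warning is aimed at.
`checksum_scratch()` and `SCRATCH_SIZE` are made up for illustration;
building it with something like `-Wframe-larger-than=1024` would
normally be expected to warn, while the generated code stays the same
with or without the flag.

```c
#include <stddef.h>

#define SCRATCH_SIZE 4096	/* hypothetical, well above the limit */

int checksum_scratch(unsigned char seed)
{
	/* volatile keeps the buffer from being optimized away */
	volatile unsigned char scratch[SCRATCH_SIZE];
	int sum = 0;

	for (size_t i = 0; i < SCRATCH_SIZE; i++)
		scratch[i] = (unsigned char)(seed + i);
	for (size_t i = 0; i < SCRATCH_SIZE; i++)
		sum += scratch[i];
	return sum;
}
```

In kernel code a buffer like this is the sort of thing that probably
should be heap allocated, percpu, or static instead.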

> when basically nobody appears to have asked it to
> inline it?

That's not really how inlining works. If you don't specify compiler
attributes, then the compiler can decide to inline or not at its
discretion. The `inline` keyword or its absence doesn't really affect
this. __attribute__((always_inline)) and __attribute__((noinline))
can give you more control, but there are hyper obscure edge cases
where even those don't work as advertised.
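
For reference, a small sketch of the three spellings using the
GCC/Clang attribute syntax. The function names are invented for the
example; the attributes only influence code generation, so all three
helpers compute the same thing.

```c
/* `inline` alone: the optimizer still decides either way. */
static inline int add_hint(int a, int b)
{
	return a + b;
}

/* Asks the compiler to inline at every call site. */
static inline __attribute__((always_inline)) int add_forced(int a, int b)
{
	return a + b;
}

/* Asks the compiler to keep this as a real, out-of-line call. */
static __attribute__((noinline)) int add_outlined(int a, int b)
{
	return a + b;
}

int sum_all(int a, int b)
{
	return add_hint(a, b) + add_forced(a, b) + add_outlined(a, b);
}
```

Comparing the disassembly of `sum_all()` at different optimization
levels is a quick way to see which calls survived.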

--
Thanks,
~Nick Desaulniers