Re: [PATCH v3 1/1] mtd: cfi_cmdset_0001: Factor out do_write_buffer_locked() to reduce stack frame

From: Andy Shevchenko

Date: Tue Apr 14 2026 - 08:42:23 EST


On Thu, Apr 09, 2026 at 12:28:46PM +0100, David Laight wrote:
> On Thu, 9 Apr 2026 09:58:28 +0200
> Lukas Wunner <lukas@xxxxxxxxx> wrote:
> > On Thu, Apr 09, 2026 at 08:26:11AM +0100, David Laight wrote:
> > > On Wed, 8 Apr 2026 23:11:48 +0200 Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx> wrote:
> > > > Compiler is not happy about used stack frame:
> > > >
> > > > drivers/mtd/chips/cfi_cmdset_0001.c: In function 'do_write_buffer':
> > > > drivers/mtd/chips/cfi_cmdset_0001.c:1887:1: error: the frame size of 1296 bytes is larger than 1280 bytes [-Werror=frame-larger-than=]
> > > >
> > > > Fix this by factoring out do_write_buffer_locked().
> > >
> > > Does this just split the large stack frame between two nested functions?
> > > I'd also expect the compiler to inline do_write_buffer_locked() so it
> > > makes little difference.
> > > OTOH I can't immediately see where the large stack frame comes from.
> >
> > The error occurs for an allmodconfig build on arm, which implies
> > CONFIG_KASAN_STACK=y and thus increases stack usage vis-à-vis a
> > "regular" build.
> >
> > Stack usage is high here because of the three "map_word" types,
> > which can each be up to 256 unsigned longs (32 * 8), see the
> > definitions of MAX_MAP_LONGS, MAX_MAP_BANKWIDTH, map_word in
> > include/linux/mtd/map.h.
>
> Ugg - that code is horrid.
> Returning structures by value isn't really a good idea.
>
> >
> > Possible solutions:
> >
> > - Disable KASAN entirely for this file:
> > https://lore.kernel.org/all/adX3SHYgazijahbG@xxxxxxxxx/
> >
> > Not always a good option, particularly for stuff like lib/maple_tree.c
> > where the same issue exists in mas_wr_spanning_store() and KASAN would
> > certainly be good to have for that one.
>
> I've peeked at that at least once.
> Some big functions get inlined; IIRC one small function is basically:
> if (expr) a(args) else b(args);
> and marking both a and b noinline would help a lot.
>
> >
> > - Use heap instead of stack.
> >
> > - Split function in smaller chunks and mark them "noinline".
>
> That might make the code easier to read as well.
>
> But looking at it, I think that a small amount of refactoring
> (mostly moving the initial 'status' check before the command
> is written) would mean that only one 'map_word' would be valid
> at any one time.
>
> I didn't look at what was really happening though.
> I suspect it is similar to some code I've written for accessing serial
> EEPROM where the control data is written one bit at a time, but the
> data itself is read/written in 4 bit chunks (although the low-level hardware
> did multiple 'nibble' accesses for wider transfers).
> In any case it surely can't be necessary to have a 256+ byte structure
> to hold the 8-bit command/status values.
> (In my case the 8 bits got 'spread' across a 32bit word and written
> (to the fgpa - helped because I was writing that end as well) as a single word.)

Okay, I leave this to maintainers to decide what to do with my contribution.
Dunno if this refactoring helps doing better one in the future (like David
suggested) or should be rewritten completely. In my opinion, smaller functions
are always easier to follow.

--
With Best Regards,
Andy Shevchenko