Re: ipv4/tcp.c:4234:1: error: the frame size of 1152 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
From: Linus Torvalds
Date: Wed Sep 08 2021 - 12:06:47 EST
On Wed, Sep 8, 2021 at 3:03 AM Wei Liu <wei.liu@xxxxxxxxxx> wrote:
>
> Thanks for the heads-up. I found one instance of this bad practice in
> hv_apic.c. Presumably that's the one you were referring to.
Yeah, that must have been the one I saw.
> However calling into the allocator from that IPI path seems very heavy
> weight. I will discuss with fellow engineers on how to fix it properly.
In other places, the options have been fairly straightforward:
(a) avoid the allocation entirely.
I think the main reason hyperv does it is because it wants to clear
the "current cpu" from the cpumask for the ALL_BUT_SELF case, and if
you can just instead keep track of that "all but self" bit separately
and pass it down the call chain, you may not need that allocation at
all.
(b) use a static percpu allocation
The IPI code generally wants interrupts disabled anyway, or it
certainly can just do the required preemption disable to make sure
that it has exclusive access to a percpu allocation.
(c) take advantage of any hypervisor limitations
If hyperv is limited to some smaller number of CPU's - google seems to
imply 240 - maybe you can keep such a smaller allocation on the stack,
but just verify that the incoming cpumask is less than whatever the
hyperv limit is.
That (c) is obviously the worst choice in some sense, but in some
cases limiting yourself to simplify things isn't wrong.
I suspect the percpu allocation model is the easiest one. It's what
other IPI code does. See for example
arch/x86/kernel/apic/x2apic_cluster.c:
__x2apic_send_IPI_mask()
and that percpu 'ipi_mask' thing.
That said, if you are already just iterating over the mask, doing (a)
may be trivial. No allocation at all is even better than a percpu one.
I'm sure there are other options. The above is just the obvious ones
that come to mind.
Linus