Re: [PATCH] modpost: add allow list for llvm IPSCCP

From: Nick Desaulniers
Date: Wed Sep 29 2021 - 20:19:09 EST


On Wed, Sep 29, 2021 at 4:28 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Sep 29, 2021 at 3:59 PM Nick Desaulniers
> <ndesaulniers@xxxxxxxxxx> wrote:
> >
> > +static const struct secref_exception secref_allowlist[] = {
> > + { .fromsym = "__first_node", .tosym = "numa_nodes_parsed" },
> > + { .fromsym = "__next_node", .tosym = "numa_nodes_parsed" },
> > + { .fromsym = "__nodes_weight", .tosym = "numa_nodes_parsed" },
> > + { .fromsym = "early_get_smp_config", .tosym = "x86_init" },
> > + { .fromsym = "test_bit", .tosym = "numa_nodes_parsed" },
> > +};

Thanks for your feedback. This has been a long-standing issue with no
clear path forward; I was looking forward to your input.

>
> This list is basically made-up and random.

Definitely brittle. And it contains checks that are specific to
basically one set of configs for one arch. It sucks to pay that cost
for unaffected architectures.

> Why did those functions not get inlined?

$ make LLVM=1 -j72 allmodconfig
$ make LLVM=1 -j72 arch/x86/mm/amdtopology.o KCFLAGS=-Rpass-missed=inline.
...
arch/x86/mm/amdtopology.c:110:7: remark: 'test_bit' not inlined into
'amd_numa_init' because too costly to inline (cost=115, threshold=45)
[-Rpass-missed=inline]
if (node_isset(nodeid, numa_nodes_parsed)) {
^
arch/x86/mm/amdtopology.c:157:7: remark: '__nodes_weight' not inlined
into 'amd_numa_init' because too costly to inline (cost=60,
threshold=45) [-Rpass-missed=inline]
if (!nodes_weight(numa_nodes_parsed))
^
arch/x86/mm/amdtopology.c:171:2: remark: 'early_get_smp_config' not
inlined into 'amd_numa_init' because too costly to inline (cost=85,
threshold=45) [-Rpass-missed=inline]
early_get_smp_config();
^
arch/x86/mm/amdtopology.c:178:2: remark: '__first_node' not inlined
into 'amd_numa_init' because too costly to inline (cost=70,
threshold=45) [-Rpass-missed=inline]
for_each_node_mask(i, numa_nodes_parsed)
^
arch/x86/mm/amdtopology.c:178:2: remark: '__next_node' not inlined
into 'amd_numa_init' because too costly to inline (cost=95,
threshold=45) [-Rpass-missed=inline]


ie. for allmodconfig, the sanitizers add too much instrumentation to
the callees that they become too large to be considered profitable to
inline by the cost model. Note that LLVM's inliner works bottom up,
not top down.

Though for the defconfig case...somehow the cost is more than with the
sanitizers...

arch/x86/mm/amdtopology.c:157:7: remark: '__nodes_weight' not inlined
into 'amd_numa_init' because too costly to inline (cost=930,
threshold=45) [-Rpass-missed=inline]
if (!nodes_weight(numa_nodes_parsed))
^

Looking at the output of `make LLVM=1 -j72
arch/x86/mm/amdtopology.ll`, @__nodes_weight is just some inline asm
(.altinstructions). I wonder if I need to teach the cost model about
`asm inline`...

For the allmodconfig build it looks like `__nodes_weight` calls
`__bitmap_weight` and the code coverage runtime hooks.

> Wouldn't it be better to make
> them always-inline?

Perhaps, see what that might look like:
https://github.com/ClangBuiltLinux/linux/issues/1302#issuecomment-807260475
Does that look better?

> Or, like in at least the early_get_smp_config() case, just make it be
> marked __init, so that if it doesn't get inlined it gets the right
> section?

In the case of early_get_smp_config(), that's what Boris suggested:
https://lore.kernel.org/lkml/20210225114533.GA380@xxxxxxx/

>
> It seems silly to add random source mappings to a checking program.
>
> It was bad for the gcc constprop hack, but at least there it was a

Part of me feels like modpost not warning on those is permitting a
"memory leak," in so far as code that's only called from .init callers
is never reclaimed. Or leaving behind gadgets...

> clear case of "this inlining failed". This ad-hoc list has cases of
> things that are clearly wrong in general ("test_bit()" must not use
> initdata), and that "ok, the function just doesn't have the right
> section marker.

Sorry, what do you mean "test_bit() must not use initdata?" Because it
can lead to problems like this? Or...?

include/linux/nodemask.h has a comment that I'd bet predates that
modpost "Pattern 5" gcc constprop hack.
https://github.com/ClangBuiltLinux/linux/blob/83d09ad4b950651a95d37697f1493c00d888d0db/include/linux/nodemask.h#L119-L125

>
> (All of get_smp_config/early_get_smp_config/find_smp_config should be
> __init, since they most definitely cannot work after __init time - but
> why a compiler doesn't just inline them when they are one single
> indirect call, I don't really get)
>
> Linus



--
Thanks,
~Nick Desaulniers