Re: [PATCH] fs/select: avoid clang stack usage warning
From: Nick Desaulniers
Date: Mon Oct 10 2022 - 14:40:56 EST
On Fri, Oct 7, 2022 at 3:54 PM Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
>
> This seems to be affected by -fno-conserve-stack, a currently gcc-only
> command line flag. If I remove that, then i386 defconfig will inline
> do_select but x86_64 defconfig will not.
>
> I have a sneaking suspicion that -fno-conserve-stack and
> -Wframe-larger-than conspire in GCC to avoid inlining when doing so
> would trip `-Wframe-larger-than` warnings, but it's just a conspiracy
> theory; I haven't read the source. Probably should implement exactly
> that behavior in LLVM.
Sorry, that should have read `-fconserve-stack` (not `-fno-conserve-stack`).
Playing with:
https://godbolt.org/z/hE67j1Y9G
experimentally, it looks like irrespective of -Wframe-larger-than,
-fconserve-stack will try to avoid inlining callees if their frame
size is 512B or greater for x86-64 targets, and 410B or greater for
32b targets. aarch64 is 410B though, perhaps that's leftover from the
32b ARM port. There's probably more to the story there though.
> I'll triple check 32b+64b arm configs next week to verify. But if GCC
> is not inlining do_select into core_sys_select then I think my patch
> https://lore.kernel.org/llvm/20221007201140.1744961-1-ndesaulniers@xxxxxxxxxx/
> is on the right track; probably could drop the 32b-only condition and
> make a note of GCC in the commit message.
arm64 does not inline do_select into core_sys_select with
aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110
for defconfig.
$ CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 make -j128 defconfig fs/select.o
$ llvm-objdump -Dr --disassemble-symbols=core_sys_select fs/select.o |
grep do_select
1a48: 2e fb ff 97 bl 0x700 <do_select>
Same for 32b ARM.
arm-linux-gnueabi-gcc (Debian 10.2.1-6) 10.2.1 20210110
$ CROSS_COMPILE=arm-linux-gnueabi- ARCH=arm make -j128 defconfig fs/select.o
$ llvm-objdump -Dr --disassemble-symbols=core_sys_select fs/select.o |
grep do_select
1620: 07 fc ff eb bl #-4068 <do_select>
Is there a set of configs or different compiler version for which
that's not the case? Perhaps. But it doesn't look like marking
do_select noinline_for_stack changes the default behavior for GCC
builds, which is good.
So it looks like it's just clang being aggressive with inlining since
it doesn't have -fconserve-stack. I think
https://lore.kernel.org/lkml/20221007201140.1744961-1-ndesaulniers@xxxxxxxxxx/
is still on the right track, though I'd remove the 32b only guard for
v2.
Christophe mentioned something about KASAN and GCC. I failed to
reproduce, and didn't see any reports on lore that seemed relevant.
https://lore.kernel.org/lkml/20221010074409.GA20998@xxxxxx/
I'll wait a day to see if there's more info (a config that reproduces)
before sending a v2 though.
> Also, my colleague Paul just whipped up a neat tool to help debug
> -Wframe-larger-than.
> https://reviews.llvm.org/D135488
> See the output from my run here:
> https://paste.debian.net/1256338/
> It's a very early WIP, but I think it would be incredibly helpful to
> have this, and will probably help us improve Clang's stack usage.
Paul also mentioned that -finline-max-stacksize is a thing, at least for clang.
https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-finline-max-stacksize
Though this only landed recently
https://reviews.llvm.org/rG8564e2fea559 and wont ship until clang-16.
That feels like a large hammer for core_sys_select/do_select; I think
we can use a fine scalpel. But it might be interesting to use that
with KASAN.
--
Thanks,
~Nick Desaulniers