Re: [PATCH] lib/clz_ctz.c: Fix __clzdi2() and __ctzdi2() for 32-bit kernels
From: Nick Desaulniers
Date: Mon Aug 28 2023 - 16:14:49 EST
On Mon, Aug 28, 2023 at 9:25 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, 28 Aug 2023 at 00:33, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> >
> > Several architectures (incl. x86, but excl. amd64) do build the kernel with
> > -ffreestanding.
> >
> > IIRC, the issue was that without that, gcc was "optimizing" calls
> > to standard functions (implemented as inline optimized assembler
> > functions) by replacing them with calls to other standard functions
> > (also implemented as inline optimized assembler functions).
>
> So using -ffreestanding is definitely the right thing to do for a
> kernel in theory. It's very much supposed to tell the compiler to not
> assume a standard libc, and without that gcc will do various
> transformations that make sense when you "know" what libc does, but
> may not make sense in the limited library model of a kernel.
-ffreestanding is probably a good suggestion for any embedded
platform. But given the size of the kernel, and how closely the
symbols the kernel provides match the names and semantics the
compiler expects, I think -ffreestanding should not be set at this
point for the Linux kernel.
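For anyone who wants to check which mode a given translation unit is
built in: the C standard has the compiler define __STDC_HOSTED__ to 1
for a hosted implementation and 0 for a freestanding one, and GCC and
Clang set it according to -ffreestanding. A minimal sketch (the
function name is made up for illustration; this is not kernel code):

```c
/* Hypothetical helper for illustration only, not taken from the kernel. */
int built_hosted(void)
{
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 1
	return 1;	/* hosted: the compiler may assume a standard libc */
#else
	return 0;	/* freestanding, e.g. built with -ffreestanding */
#endif
}
```

Building the same file with and without -ffreestanding should flip the
return value.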
>
> So without it, gcc will do things like converting a 'printf()' call
> without any conversion characters to a much cheaper 'puts()' etc. Now,
> we often avoid that issue entirely by having our own function names
> (ie printk()), but we do tend to use the *really* core C library
> names.
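To make the printf()/puts() case concrete, here is a small userspace
sketch (say_hello() is a hypothetical name, not something from the
tree). As far as I know, GCC only does this rewrite when the return
value of printf() is unused, so the call below deliberately ignores it:

```c
#include <stdio.h>

/* Hypothetical function for illustration only. */
int say_hello(void)
{
	/*
	 * No conversion characters, trailing newline, result ignored:
	 * in hosted mode the compiler is allowed to emit puts("hello")
	 * here. With -ffreestanding (which implies -fno-builtin) the
	 * printf() call should be left as written.
	 */
	printf("hello\n");
	return 0;
}
```

Comparing the `gcc -S` output with and without -ffreestanding should
show a different call target, though the details will depend on the
compiler version.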
>
> Anyway, it turns out that some of the things you miss out on with
> -ffreestanding are kind of important. In particular, at least gcc will
> stop some 'memcpy()' optimizations too, which ends up being pretty
> horrendous.
>
> So while -ffreestanding would be the right thing to do in theory, in
> practice it's actually pretty horrible. It's a big hammer that affects
> a lot of things, and while many of them make sense for a kernel, some
> of them are really bad. Which is why x86-64 no longer uses it.
I agree.
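A sketch of the kind of memcpy() use where this matters (load_u32() is
a made-up helper, not an existing kernel function). My understanding
is that with the builtin available the compiler can expand a small,
constant-size copy inline, while with -ffreestanding it generally
stays an out-of-line call:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from a possibly unaligned buffer. */
uint32_t load_u32(const void *p)
{
	uint32_t v;

	/*
	 * The size is a small compile-time constant. When memcpy() is
	 * treated as a builtin, the compiler is free to replace this
	 * with a plain load; otherwise it remains a function call.
	 */
	memcpy(&v, p, sizeof(v));
	return v;
}
```

Whether the call actually disappears depends on the target, the
optimization level, and the compiler version.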
>
> I would actually suggest other architectures take a look if they care
> at all about code generation. In particular, look at the x86-64
> version of 'string.h' in
>
> arch/x86/include/asm/string_64.h
>
> and note the difference with the 32-bit one. The 32-bit one is the
> "this is how we used to do it" that nobody cared enough to change. The
> 64-bit one is much simpler and actually generates better code simply
> because gcc recognizes memcpy() and friends, and will then inline it
> when small etc.
>
> The *downside* is that now you have to trust the compiler to do the
> right thing. And that will depend on compiler version etc. There's a
> reason why 32-bit x86 does everything by hand: when your compiler
> history starts at gcc-1.40, things are simply *very* different from
> when you now rely on gcc-5.1 and newer...
>
> Put another way: gcc has changed, and what used to make sense probably
> doesn't make sense any more.
Yep, I think it's time to review the use of -ffreestanding in the Linux kernel.
--
Thanks,
~Nick Desaulniers