Re: [PATCH v9 09/27] rust: add `compiler_builtins` crate

From: Nick Desaulniers
Date: Wed Aug 24 2022 - 14:39:05 EST


On Mon, Aug 22, 2022 at 4:55 PM Nick Desaulniers
<ndesaulniers@xxxxxxxxxx> wrote:
>
> On Fri, Aug 5, 2022 at 8:44 AM Miguel Ojeda <ojeda@xxxxxxxxxx> wrote:
> >
> > Rust provides `compiler_builtins` as a port of LLVM's `compiler-rt`.
> > Since we do not need the vast majority of them, we avoid the
> > dependency by providing our own crate.
> >
> > Co-developed-by: Alex Gaynor <alex.gaynor@xxxxxxxxx>
> > Signed-off-by: Alex Gaynor <alex.gaynor@xxxxxxxxx>
> > Co-developed-by: Wedson Almeida Filho <wedsonaf@xxxxxxxxxx>
> > Signed-off-by: Wedson Almeida Filho <wedsonaf@xxxxxxxxxx>
> > Co-developed-by: Sven Van Asbroeck <thesven73@xxxxxxxxx>
> > Signed-off-by: Sven Van Asbroeck <thesven73@xxxxxxxxx>
> > Co-developed-by: Gary Guo <gary@xxxxxxxxxxx>
> > Signed-off-by: Gary Guo <gary@xxxxxxxxxxx>
> > Signed-off-by: Miguel Ojeda <ojeda@xxxxxxxxxx>
>
> I still really really do not like this patch. Thoughts below.
>
> > ---
> > rust/compiler_builtins.rs | 63 +++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 63 insertions(+)
> > create mode 100644 rust/compiler_builtins.rs
> >
> > diff --git a/rust/compiler_builtins.rs b/rust/compiler_builtins.rs
> > new file mode 100644
> > index 000000000000..f8f39a3e6855
> > --- /dev/null
> > +++ b/rust/compiler_builtins.rs
> > @@ -0,0 +1,63 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +//! Our own `compiler_builtins`.
> > +//!
> > +//! Rust provides [`compiler_builtins`] as a port of LLVM's [`compiler-rt`].
> > +//! Since we do not need the vast majority of them, we avoid the dependency
> > +//! by providing this file.
> > +//!
> > +//! At the moment, some builtins are required that should not be. For instance,
> > +//! [`core`] has 128-bit integers functionality which we should not be compiling
>
> It's not just 128b, these are for double word sizes, so for 32b
> targets (I recall an earlier version of series supporting armv6) these
> symbols are for 64b types.
>
> > +//! in. We will work with upstream [`core`] to provide feature flags to disable
> > +//! the parts we do not need. For the moment, we define them to [`panic!`] at
> > +//! runtime for simplicity to catch mistakes, instead of performing surgery
> > +//! on `core.o`.
>
> Wait, so Kbuild performs surgery on alloc, but core is off limits? bah
>
> > +//!
> > +//! In any case, all these symbols are weakened to ensure we do not override
> > +//! those that may be provided by the rest of the kernel.
> > +//!
> > +//! [`compiler_builtins`]: https://github.com/rust-lang/compiler-builtins
> > +//! [`compiler-rt`]: https://compiler-rt.llvm.org/
> > +
> > +#![feature(compiler_builtins)]
> > +#![compiler_builtins]
> > +#![no_builtins]
> > +#![no_std]
> > +
> > +macro_rules! define_panicking_intrinsics(
> > + ($reason: tt, { $($ident: ident, )* }) => {
> > + $(
> > + #[doc(hidden)]
> > + #[no_mangle]
> > + pub extern "C" fn $ident() {
> > + panic!($reason);
> > + }
> > + )*
> > + }
> > +);
> > +
> > +define_panicking_intrinsics!("`f32` should not be used", {
> > + __eqsf2,
>
> I don't see ^ this symbol in core.
>
> $ llvm-nm rust/core.o | grep "U __" | sort | tr -s " "
> U __gesf2
> U __lesf2
> U __udivti3
> U __umodti3
> U __unorddf2
> U __unordsf2
> U __x86_indirect_thunk_r11
>
> so a few of these seem extraneous. Are you sure they're coming from
> core? (or was it the different arch support that was removed from v8
> to facilitate v9?)
>
> __gesf2, __lesf2, and __unordsf2 all seem to be coming from
> <<f32>::classify> and <<f32>::to_bits::ct_f32_to_u32>. Surely we can
> avoid more f32 support by relying on objcopy
> -N/--strip-symbol=/--strip-symbols=filename to slice out/drop symbols
> from core.o?
>
> <core::fmt::num::exp_u128> uses __umodti3+__udivti3
>
> __unorddf2 is referenced by <<f64>::classify> and
> <<f64>::to_bits::ct_f64_to_u64>.
>
> Can we slice those 5 symbols out from core, or are they further
> referenced from within core? If we can, we can drop this patch
> outright, and avoid a false-negative when C code generates references
> to these symbols from the compiler runtime, for architectures where
> don't want to provide a full compiler runtime. (--rtlib=none is a
> pipe dream; first we need to get to the point where we're not
> -ffreestanding for all targets...)
>
> I suspect someone can iteratively whittle down core to avoid these symbols?
>
> ```
> diff --git a/rust/Makefile b/rust/Makefile
> index 7700d3853404..e64a82c21156 100644
> --- a/rust/Makefile
> +++ b/rust/Makefile
> @@ -359,6 +359,7 @@ $(obj)/core.o: $(RUST_LIB_SRC)/core/src/lib.rs
> $(obj)/target.json FORCE
> $(obj)/compiler_builtins.o: private rustc_objcopy = -w -W '__*'
> $(obj)/compiler_builtins.o: $(src)/compiler_builtins.rs $(obj)/core.o FORCE
> $(call if_changed_dep,rustc_library)
> + @$(OBJCOPY) --strip-symbols=$(obj)/strip-symbols.txt $(obj)/core.o
>
> $(obj)/alloc.o: private skip_clippy = 1
> $(obj)/alloc.o: private skip_flags = -Dunreachable_pub
> ```
> then define symbols in rust/strip-symbols.txt?
>
> I'm getting the sense that no, there are many functions like
> <usize as core::fmt::UpperExp>::fmt
> <i64 as core::fmt::UpperExp>::fmt
> <u64 as core::fmt::UpperExp>::fmt

Another idea I had, and Ard mentioned to me this morning that efistub
does something similar:

I think objcopy can rename symbol references. So instead of calling
foo, I think you can use objcopy to call bar instead.

To avoid the case of Rust core refering to symbols typically provided
by the --rtlib={libgcc|compiler-rt}, what if we:
1. build core.o (with reference to the unfavorable symbols)
2. use objcopy to rename references in core.o to the symbols listed in
this file; maybe prepend them with `rust_core_` or something.
3. provide this crate with panic'ing definitions, but for the renamed symbols.

That way, enabling CONFIG_RUST=y won't hide the case where C code is
triggering such libcalls from the C compiler?

To restate the concern, I don't want CONFIG_RUST=y to hide that fact
that someone wrote `float x, y; if (x != y) ...` in C code, where the
C compiler will likely lower that to a libcall to __nesf2 since with
the current approach in this patch, linkage would proceed. The current
behavior is to intentionally fail to link. I think what I describe
above may work to keep this intended behavior of the kernel's build
system.

Ah, looks like there's objcopy flags:
--redefine-sym old=new
Change the name of a symbol old, to new. This can be
useful when one is trying link
two things together for which you have no source, and there
are name collisions.

--redefine-syms=filename
Apply --redefine-sym to each symbol pair "old new" listed
in the file filename.
filename is simply a flat file, with one symbol pair per
line. Line comments may be
introduced by the hash character. This option may be given
more than once.

So probably could start with my diff from above, but use
--redefine-syms=filename instead of --strip-symbols=.


>
> > + __gesf2,
> > + __lesf2,
> > + __nesf2,
> > + __unordsf2,
> > +});
> > +
> > +define_panicking_intrinsics!("`f64` should not be used", {
> > + __unorddf2,
> > +});
> > +
> > +define_panicking_intrinsics!("`i128` should not be used", {
> > + __ashrti3,
> > + __muloti4,
> > + __multi3,
> > +});
> > +
> > +define_panicking_intrinsics!("`u128` should not be used", {
> > + __ashlti3,
> > + __lshrti3,
> > + __udivmodti4,
> > + __udivti3,
> > + __umodti3,
> > +});
> > --
> > 2.37.1
> >
>
>
> --
> Thanks,
> ~Nick Desaulniers



--
Thanks,
~Nick Desaulniers