Re: [PATCH] x86: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION

From: Nathan Chancellor
Date: Mon Apr 22 2024 - 15:24:56 EST


Hi Yuntao,

Just a drive by review since I saw this patch via another CC in my
inbox, I would wait for x86 maintainer thoughts before sending a v2.

On Mon, Apr 22, 2024 at 06:05:56AM +0000, Yuntao Liu wrote:
> The current x86 architecture does not yet support the
> HAVE_LD_DEAD_CODE_DATA_ELIMINATION feature. x86 is widely used in
> embedded scenarios, and enabling this feature would be beneficial for
> reducing the size of the kernel image.
>
> In order to make this work, we keep the necessary tables by annotating
> them with KEEP, also it requires further changes to linker script to KEEP
> some tables and wildcard compiler generated sections into the right place.
>
> Enabling CONFIG_UNWINDER_ORC or CONFIG_MITIGATION_RETPOLINE will enable
> the objtool's --orc and --retpoline parameters, which will alter the
> layout of the binary file, thereby preventing gc-sections from functioning
> properly. Therefore, HAVE_LD_DEAD_CODE_DATA_ELIMINATION should only be
> selected when they are not enabled.
>
> Enabling CONFIG_LTO_CLANG or CONFIG_X86_KERNEL_IBT will use vmlinux.o
> instead of performing the slow LTO link again. This can also prevent
> gc-sections from functioning properly. Therefore, using this optimization
> when CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is not enabled.

These two paragraphs indicate to me that this feature will be
unselectable the vast majority of x86 configurations, why should the
upstream kernel support it in that case?

> The size comparison of zImage is as follows:

^ bzImage?

> x86_def_defconfig i386_defconfig tinyconfig
> 10892288 10826240 607232 no dce
> 10748928 10719744 529408 dce
> 1.3% 0.98% 12.8% shrink
>
> When using smaller config file, there is a significant reduction in the
> size of the zImage.

Same here.

What toolchain was this tested with? There have been behavior
differences between the GNU and LLVM toolchains that have shown up when
dead code elimination is enabled, such as with 32-bit ARM [1] and RISC-V
[2]. While I am not saying there are any problems here, it would be good
to qualify how well this has been tested and perhaps do some testing
with other toolchains and versions, especially since you are touching
areas guarded by CONFIG_LTO_CLANG. Does the resulting kernel boot and
run properly?

[1]: https://lore.kernel.org/30b01c65-12f2-4ee0-81d5-c7a2da2c36b4@xxxxxxxxxxxxxxxx/
[2]: https://lore.kernel.org/20230622215327.GA1135447@dev-arch.thelio-3990X/

> ---
> arch/x86/Kconfig | 1 +
> arch/x86/kernel/vmlinux.lds.S | 24 ++++++++++++------------
> scripts/link-vmlinux.sh | 2 +-
> 3 files changed, 14 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index a902680b6537..92dfbc8ee4e7 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -247,6 +247,7 @@ config X86
> select HAVE_FUNCTION_ERROR_INJECTION
> select HAVE_KRETPROBES
> select HAVE_RETHOOK
> + select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if !CONFIG_UNWINDER_ORC && !CONFIG_MITIGATION_RETPOLINE

This is incorrect, it should be

select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if !UNWINDER_ORC && !MITIGATION_RETPOLINE

> select HAVE_LIVEPATCH if X86_64
> select HAVE_MIXED_BREAKPOINTS_REGS
> select HAVE_MOD_ARCH_SPECIFIC

Cheers,
Nathan