[tip: x86/core] x86/asm: Use asm_inline() instead of asm() in clwb()

From: tip-bot2 for Uros Bizjak
Date: Wed Mar 19 2025 - 07:03:46 EST


The following commit has been merged into the x86/core branch of tip:

Commit-ID: f685a96bfd7963a587c76bd5709f2d9170820875
Gitweb: https://git.kernel.org/tip/f685a96bfd7963a587c76bd5709f2d9170820875
Author: Uros Bizjak <ubizjak@xxxxxxxxx>
AuthorDate: Thu, 13 Mar 2025 11:26:56 +01:00
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitterDate: Wed, 19 Mar 2025 11:26:58 +01:00

x86/asm: Use asm_inline() instead of asm() in clwb()

Use asm_inline() to instruct the compiler that the size of the asm()
is the minimum size of one instruction, regardless of how many
instructions the compiler thinks it contains. The ALTERNATIVE macro,
which expands to several pseudo directives, causes the instruction
length estimate to count more than 20 instructions.
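
As a rough illustration of the paragraph above, the following userspace sketch shows the mechanism outside the kernel. It is not the kernel's code: the compiler version check, the EMPTY_ASM_LINES string, and the touch() and sum_touched() helpers are all hypothetical names invented for this example. It assumes a GCC new enough to accept the "inline" qualifier on asm statements, and falls back to a plain asm otherwise.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's asm_inline macro. The version
 * test is an assumption made for this sketch; the kernel decides this
 * through its own build-time configuration check.
 */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 9
# define asm_inline __asm__ __inline
#else
# define asm_inline __asm__
#endif

/*
 * An asm body made of 24 empty lines. It emits no machine code, but
 * GCC estimates the size of an asm by counting lines and statement
 * separators, so without the inline qualifier it looks large.
 */
#define EMPTY_ASM_LINES "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n"

/* Small helper whose only content is the "large-looking" asm. */
static inline void touch(int *p)
{
	/* "+m" tells the compiler the asm may read and write *p. */
	asm_inline volatile(EMPTY_ASM_LINES : "+m" (*p));
}

/* Caller into which the compiler may or may not inline touch(). */
int sum_touched(int *v, size_t n)
{
	int sum = 0;

	for (size_t i = 0; i < n; i++) {
		touch(&v[i]);
		sum += v[i];
	}
	return sum;
}
```

With the inline qualifier, the asm statement is treated as minimally small for inlining decisions, so the number of lines in the template no longer discourages inlining of touch(). Whether a given call is actually inlined still depends on the compiler and optimization level.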

bloat-o-meter reports a slight increase in code size
for the x86_64 defconfig object file, compiled with gcc-14.2:

add/remove: 0/2 grow/shrink: 3/0 up/down: 190/-59 (131)

Function                         old     new   delta
__copy_user_flushcache           166     247     +81
__memcpy_flushcache              369     437     +68
arch_wb_cache_pmem                 6      47     +41
__pfx_clean_cache_range           16       -     -16
clean_cache_range                 43       -     -43

Total: Before=22807167, After=22807298, chg +0.00%

The compiler now inlines and removes the clean_cache_range() function.

Signed-off-by: Uros Bizjak <ubizjak@xxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Brian Gerst <brgerst@xxxxxxxxx>
Cc: H. Peter Anvin <hpa@xxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Link: https://lore.kernel.org/r/20250313102715.333142-2-ubizjak@xxxxxxxxx
---
 arch/x86/include/asm/special_insns.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 9b10bd1..6266d6b 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -185,7 +185,7 @@ static inline void clwb(volatile void *__p)
 {
 	volatile struct { char x[64]; } *p = __p;
 
-	asm volatile(ALTERNATIVE_2(
+	asm_inline volatile(ALTERNATIVE_2(
 		"ds clflush %0",
 		"clflushopt %0", X86_FEATURE_CLFLUSHOPT,
 		"clwb %0", X86_FEATURE_CLWB)