[PATCH v4 3/5] hexagon/bitops: force inlining of all bit-find functions

From: Vincent Mailhol
Date: Sun Jan 28 2024 - 00:07:08 EST


The inline keyword does not guarantee that the compiler will inline a
function. Whenever the goal is to actually inline a function,
__always_inline should be preferred instead.
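
For illustration only, here is a minimal userspace sketch of the
difference. my_always_inline is a hypothetical stand-in for the
kernel's __always_inline macro, and add_hint(), add_forced() and
sum_both() are made-up names that are not part of this patch:

```c
/* Hypothetical stand-in for the kernel's __always_inline macro. */
#define my_always_inline inline __attribute__((__always_inline__))

/* Plain inline: only a hint, the compiler may still emit a real call. */
static inline int add_hint(int a, int b)
{
	return a + b;
}

/*
 * always_inline: GCC and clang inline this at every direct call site,
 * even at -O0, or report an error if they cannot.
 */
static my_always_inline int add_forced(int a, int b)
{
	return a + b;
}

/* Both helpers return the same value; only the inlining guarantee differs. */
int sum_both(int a, int b)
{
	return add_hint(a, b) + add_forced(a, b);
}
```

Whether add_hint() ends up inlined depends on the compiler and the
optimization level, which is the uncertainty this patch removes.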

__always_inline is also needed for further optimizations which will
come up in a follow-up patch.

Inline all the bit-find functions which have a custom hexagon assembly
implementation, namely: __ffs(), ffs(), ffz(), __fls(), fls().
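
As a reminder of what these five functions compute, here is a rough
userspace sketch of their semantics built on GCC/clang builtins. It is
not the hexagon assembly implementation. The ref_* names are
hypothetical, with ref_ffs_word() and ref_fls_word() standing for
__ffs() and __fls():

```c
#include <limits.h>

/* ffs(): 1-based index of the lowest set bit, 0 if x == 0. */
static inline int ref_ffs(int x)
{
	return __builtin_ffs(x);
}

/* __ffs(): 0-based index of the lowest set bit, undefined for word == 0. */
static inline unsigned long ref_ffs_word(unsigned long word)
{
	return (unsigned long)__builtin_ctzl(word);
}

/* ffz(): 0-based index of the lowest clear bit, undefined if x has no zero bit. */
static inline long ref_ffz(int x)
{
	return __builtin_ctz(~(unsigned int)x);
}

/* fls(): 1-based index of the highest set bit, 0 if x == 0. */
static inline int ref_fls(unsigned int x)
{
	return x ? (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x) : 0;
}

/* __fls(): 0-based index of the highest set bit, undefined for word == 0. */
static inline unsigned long ref_fls_word(unsigned long word)
{
	return (sizeof(word) * CHAR_BIT - 1) - (unsigned long)__builtin_clzl(word);
}
```

The values agree with the comments in bitops.h, for example
fls(0) = 0, fls(1) = 1 and fls(0x80000000) = 32.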

On linux v6.7 defconfig with clang 17.0.6, this does not change the
final size, meaning that, overall, those functions were already
inlined by modern clang versions:

  $ size --format=GNU vmlinux.before vmlinux.after
        text       data        bss      total filename
     4827900    1798340     364057    6990297 vmlinux.before
     4827900    1798340     364057    6990297 vmlinux.after

Reference: commit 8dd5032d9c54 ("x86/asm/bitops: Force inlining of test_and_set_bit and friends")
Link: https://git.kernel.org/torvalds/c/8dd5032d9c54

Signed-off-by: Vincent Mailhol <mailhol.vincent@xxxxxxxxxx>
---
 arch/hexagon/include/asm/bitops.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/hexagon/include/asm/bitops.h b/arch/hexagon/include/asm/bitops.h
index 160d8f37fa1a..e856d6dbfe16 100644
--- a/arch/hexagon/include/asm/bitops.h
+++ b/arch/hexagon/include/asm/bitops.h
@@ -200,7 +200,7 @@ arch_test_bit_acquire(unsigned long nr, const volatile unsigned long *addr)
  *
  * Undefined if no zero exists, so code should check against ~0UL first.
  */
-static inline long ffz(int x)
+static __always_inline long ffz(int x)
 {
 	int r;

@@ -217,7 +217,7 @@ static inline long ffz(int x)
  * This is defined the same way as ffs.
  * Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32.
  */
-static inline int fls(unsigned int x)
+static __always_inline int fls(unsigned int x)
 {
 	int r;

@@ -238,7 +238,7 @@ static inline int fls(unsigned int x)
  * the libc and compiler builtin ffs routines, therefore
  * differs in spirit from the above ffz (man ffs).
  */
-static inline int ffs(int x)
+static __always_inline int ffs(int x)
 {
 	int r;

@@ -260,7 +260,7 @@ static inline int ffs(int x)
  * bits_per_long assumed to be 32
  * numbering starts at 0 I think (instead of 1 like ffs)
  */
-static inline unsigned long __ffs(unsigned long word)
+static __always_inline unsigned long __ffs(unsigned long word)
 {
 	int num;

@@ -278,7 +278,7 @@ static inline unsigned long __ffs(unsigned long word)
  * Undefined if no set bit exists, so code should check against 0 first.
  * bits_per_long assumed to be 32
  */
-static inline unsigned long __fls(unsigned long word)
+static __always_inline unsigned long __fls(unsigned long word)
 {
 	int num;

--
2.43.0