[PATCH] cpumask: Optimize cpumask_any_but()

From: Kuan-Wei Chiu
Date: Fri Jan 17 2025 - 09:27:12 EST


cpumask_any_but() does not need a loop to determine the CPU index to
return. If the first set bit in the cpumask is not equal to the
specified CPU, we can return the index of the first set bit directly.
Otherwise, we return the index of the next set bit, which by
construction cannot be equal to the specified CPU.

This change replaces the loop with a single if statement, allowing the
compiler to generate more compact code.
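
To illustrate why the two forms should agree, here is a minimal
userspace sketch, not kernel code. It assumes the whole mask fits in one
64-bit word and that the compiler provides __builtin_ctzll() (GCC or
Clang). model_find_first_bit(), model_find_next_bit(), any_but_loop(),
any_but_direct() and check_equivalence() are hypothetical names made up
for this illustration; they only model the behaviour of the real
helpers and are not part of the kernel API.

```c
#include <assert.h>
#include <stdint.h>

#define NBITS 64u	/* assumption: the whole mask fits in one 64-bit word */

/*
 * Hypothetical stand-in for find_next_bit(): index of the first set bit
 * at or after 'start', or NBITS if there is none.
 */
static unsigned int model_find_next_bit(uint64_t mask, unsigned int start)
{
	if (start >= NBITS)
		return NBITS;
	mask &= ~UINT64_C(0) << start;	/* drop bits below 'start' */
	return mask ? (unsigned int)__builtin_ctzll(mask) : NBITS;
}

/* Hypothetical stand-in for find_first_bit(). */
static unsigned int model_find_first_bit(uint64_t mask)
{
	return model_find_next_bit(mask, 0);
}

/* Old shape: walk the set bits until one differs from cpu. */
static unsigned int any_but_loop(uint64_t mask, unsigned int cpu)
{
	unsigned int i;

	for (i = model_find_first_bit(mask); i < NBITS;
	     i = model_find_next_bit(mask, i + 1))
		if (i != cpu)
			break;
	return i;
}

/* New shape: at most two lookups, no loop. */
static unsigned int any_but_direct(uint64_t mask, unsigned int cpu)
{
	unsigned int i = model_find_first_bit(mask);

	if (i != cpu)
		return i;
	return model_find_next_bit(mask, i + 1);
}

/* Compare both shapes over a few masks and every in-range cpu value. */
static void check_equivalence(void)
{
	static const uint64_t masks[] = {
		0, 1, 2, 0x5, 0xf0, UINT64_C(1) << 63, ~UINT64_C(0),
	};
	unsigned int m, cpu;

	for (m = 0; m < sizeof(masks) / sizeof(masks[0]); m++)
		for (cpu = 0; cpu < NBITS; cpu++)
			assert(any_but_loop(masks[m], cpu) ==
			       any_but_direct(masks[m], cpu));
}
```

In this model, both functions return NBITS when no other bit is set,
which mirrors an out-of-range return value meaning "no other CPU found".
The sketch only checks that the two shapes agree on the returned index;
it says nothing about the generated code or its speed.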

As a result, the size of the bzImage built with x86 defconfig is
reduced by 4096 bytes:

* Before:
$ size arch/x86/boot/bzImage
    text    data     bss      dec     hex filename
13537280    1024       0 13538304  ce9400 arch/x86/boot/bzImage

* After:
$ size arch/x86/boot/bzImage
    text    data     bss      dec     hex filename
13533184    1024       0 13534208  ce8400 arch/x86/boot/bzImage

Co-developed-by: Yu-Chun Lin <eleanor15x@xxxxxxxxx>
Signed-off-by: Yu-Chun Lin <eleanor15x@xxxxxxxxx>
Signed-off-by: Kuan-Wei Chiu <visitorckw@xxxxxxxxx>
---
Not sure how to measure the efficiency difference, but I guess this
patch might be slightly more efficient or nearly the same as before. If
you have any good ideas for measuring efficiency, please let me know!

include/linux/cpumask.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 9278a50d514f..b769fcdbaa10 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -404,10 +404,10 @@ unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
 	unsigned int i;
 
 	cpumask_check(cpu);
-	for_each_cpu(i, mask)
-		if (i != cpu)
-			break;
-	return i;
+	i = find_first_bit(cpumask_bits(mask), small_cpumask_bits);
+	if (i != cpu)
+		return i;
+	return find_next_bit(cpumask_bits(mask), small_cpumask_bits, i + 1);
 }
 
 /**
--
2.34.1