[PATCH v4 0/3] lib: find_*_bit reimplementation

From: Yury Norov
Date: Mon Feb 16 2015 - 21:36:14 EST


This patchset does rework find_bit functions family to achieve better
performance, and decrease size of text. All rework is done in patch 1.
Patches 2 and 3 are about code moving and renaming.

It was boot-tested on x86_64 and MIPS (big-endian) machines.
Performance tests were ran on userspace with code like this:

/* addr[] is filled from /dev/urandom */
start = clock();
while (ret < nbits)
ret = find_next_bit(addr, nbits, ret + 1);

end = clock();
printf("%ld\t", (unsigned long) end - start);

On Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz measuremets are next:
(for find_next_bit, nbits is 8M, for find_first_bit - 80K)

find_next_bit: find_first_bit:
new current new current
26932 43151 14777 14925
26947 43182 14521 15423
26507 43824 15053 14705
27329 43759 14473 14777
26895 43367 14847 15023
26990 43693 15103 15163
26775 43299 15067 15232
27282 42752 14544 15121
27504 43088 14644 14858
26761 43856 14699 15193
26692 43075 14781 14681
27137 42969 14451 15061
... ...

find_next_bit performance gain is 35-40%;
find_first_bit - no measurable difference.

On ARM machine, there is arch-specific implementation for find_bit.
To disable it, and use generic one, please apply next patch:

---
arch/arm/include/asm/bitops.h | 19 -------------------
arch/arm/kernel/armksyms.c | 11 -----------
arch/arm/lib/Makefile | 2 +-
include/asm-generic/bitops/le.h | 1 +
4 files changed, 2 insertions(+), 31 deletions(-)

diff --git a/arch/arm/include/asm/bitops.h b/arch/arm/include/asm/bitops.h
index 5638099..e0611d1 100644
--- a/arch/arm/include/asm/bitops.h
+++ b/arch/arm/include/asm/bitops.h
@@ -192,25 +192,6 @@ extern int _find_next_bit_be(const unsigned long *p, int size, int offset);
#define test_and_clear_bit(nr,p) ATOMIC_BITOP(test_and_clear_bit,nr,p)
#define test_and_change_bit(nr,p) ATOMIC_BITOP(test_and_change_bit,nr,p)

-#ifndef __ARMEB__
-/*
- * These are the little endian, atomic definitions.
- */
-#define find_first_zero_bit(p,sz) _find_first_zero_bit_le(p,sz)
-#define find_next_zero_bit(p,sz,off) _find_next_zero_bit_le(p,sz,off)
-#define find_first_bit(p,sz) _find_first_bit_le(p,sz)
-#define find_next_bit(p,sz,off) _find_next_bit_le(p,sz,off)
-
-#else
-/*
- * These are the big endian, atomic definitions.
- */
-#define find_first_zero_bit(p,sz) _find_first_zero_bit_be(p,sz)
-#define find_next_zero_bit(p,sz,off) _find_next_zero_bit_be(p,sz,off)
-#define find_first_bit(p,sz) _find_first_bit_be(p,sz)
-#define find_next_bit(p,sz,off) _find_next_bit_be(p,sz,off)
-
-#endif

#if __LINUX_ARM_ARCH__ < 5

diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index a88671c..22e8748 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -146,17 +146,6 @@ EXPORT_SYMBOL(_clear_bit);
EXPORT_SYMBOL(_test_and_clear_bit);
EXPORT_SYMBOL(_change_bit);
EXPORT_SYMBOL(_test_and_change_bit);
-EXPORT_SYMBOL(_find_first_zero_bit_le);
-EXPORT_SYMBOL(_find_next_zero_bit_le);
-EXPORT_SYMBOL(_find_first_bit_le);
-EXPORT_SYMBOL(_find_next_bit_le);
-
-#ifdef __ARMEB__
-EXPORT_SYMBOL(_find_first_zero_bit_be);
-EXPORT_SYMBOL(_find_next_zero_bit_be);
-EXPORT_SYMBOL(_find_first_bit_be);
-EXPORT_SYMBOL(_find_next_bit_be);
-#endif

#ifdef CONFIG_FUNCTION_TRACER
#ifdef CONFIG_OLD_MCOUNT
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 0573faa..de369aa 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -6,7 +6,7 @@

lib-y := backtrace.o changebit.o csumipv6.o csumpartial.o \
csumpartialcopy.o csumpartialcopyuser.o clearbit.o \
- delay.o delay-loop.o findbit.o memchr.o memcpy.o \
+ delay.o delay-loop.o memchr.o memcpy.o \
memmove.o memset.o memzero.o setbit.o \
strchr.o strrchr.o \
testchangebit.o testclearbit.o testsetbit.o \
diff --git a/include/asm-generic/bitops/le.h b/include/asm-generic/bitops/le.h
index 6173154..9a8798f 100644
--- a/include/asm-generic/bitops/le.h
+++ b/include/asm-generic/bitops/le.h
@@ -2,6 +2,7 @@
#define _ASM_GENERIC_BITOPS_LE_H_

#include <asm/types.h>
+#include <asm-generic/bitops/find.h>
#include <asm/byteorder.h>

#if defined(__LITTLE_ENDIAN)
--
1.9.1

Thanks a lot to George Spelvin and Rasmus Villemoes for hints and
helpful discussions.

v4:
* find_last_bit found buggy and replaced with George Spelvin's version,
which handles garbage at the end of word-unaligned bitmap correctly;
* find_last_bit description fixed in 'include/linux/bitops.h';
* '_find_next_bit{,_le}: first word handling turned to be unconditional;

v3:
* conditional bit inverting in _find_next_bit replaced with XORing by
mask (patch 1);
* size/offset checkers moved from wrappers to _find_next_bit (patch 1);
* #include <linux/bitops.h> restored (patch 1);
* lib/find_next_bit descriptor changed to not appeal to file name and
so let to produce clean diff in case of rename (patch 2);
* patch 3 is generated with '-M' option to detect renaming and drop
file content;
* this cover added.

Yury Norov (3):
lib: find_*_bit reimplementation
lib: move find_last_bit to lib/find_next_bit.c
lib: rename lib/find_next_bit.c to lib/find_bit.c

include/linux/bitops.h | 4 +-
lib/Makefile | 2 +-
lib/find_bit.c | 195 +++++++++++++++++++++++++++++++++
lib/find_last_bit.c | 49 ---------
lib/find_next_bit.c | 285 -------------------------------------------------
5 files changed, 198 insertions(+), 337 deletions(-)
create mode 100644 lib/find_bit.c
delete mode 100644 lib/find_last_bit.c
delete mode 100644 lib/find_next_bit.c

--
2.1.0
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/