[PATCH 0/2] Optimize memchr()

From: Yu-Jen Chang
Date: Sun Jul 10 2022 - 10:28:45 EST


This patch series optimizes "memchr()" and adds a macro for
"memchr_inv()" so that both functions can use it to generate a bit mask.

The original implementation of "memchr()" is based on byte-wise comparison,
which does not make full use of the CPU's 64-bit or 32-bit registers. We
implement a word-wise comparison so that at least 4 bytes can be compared
at the same time. The optimized "memchr()" is nearly 4x faster than the
original one for long strings. In the Linux kernel, we find that the
length of the string searched by "memchr()" is up to 512 bytes, in
drivers/misc/lkdtm/heap.c. In our test, the optimized version is about
20% faster when the target character is at the end of a 512-byte string.
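
For readers unfamiliar with word-wise searching, the general idea can be
sketched in userspace C as below. This is only an illustration of the
technique, not the code from this series: the name memchr_wordwise() is
hypothetical, and the zero-byte test used here is the well-known
"(w - 0x01..01) & ~w & 0x80..80" trick, which may differ from what the
patches actually do.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative sketch only; memchr_wordwise() is a hypothetical name and
 * is not the implementation proposed in the patches.
 */
void *memchr_wordwise(const void *s, int c, size_t n)
{
	const unsigned char *p = s;
	const unsigned char ch = (unsigned char)c;
	const size_t wsize = sizeof(unsigned long);
	/* 0x0101...01 and 0x8080...80 for the native word width. */
	const unsigned long ones = ~0UL / 0xff;
	const unsigned long highs = ones << 7;
	/* Bit mask: the target byte replicated into every byte of a word. */
	const unsigned long pattern = ones * ch;

	/* Head: compare byte by byte until p is word-aligned. */
	while (n > 0 && ((uintptr_t)p % wsize) != 0) {
		if (*p == ch)
			return (void *)p;
		p++;
		n--;
	}

	/* Body: compare one whole word per iteration. */
	while (n >= wsize) {
		unsigned long w;

		memcpy(&w, p, wsize);	/* aligned load without aliasing issues */
		w ^= pattern;		/* bytes equal to ch become 0x00 */
		if ((w - ones) & ~w & highs)
			break;		/* this word contains a matching byte */
		p += wsize;
		n -= wsize;
	}

	/* Tail: locate the match in the flagged word, or scan the remainder. */
	while (n > 0) {
		if (*p == ch)
			return (void *)p;
		p++;
		n--;
	}
	return NULL;
}
```

A call such as memchr_wordwise(buf, 'x', len) is intended to return the
same pointer as the standard memchr(buf, 'x', len). The sketch falls back
to a byte loop once a word is flagged, which keeps it independent of
endianness at the cost of a few extra comparisons.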

We recompiled the 5.18 kernel with the optimized "memchr()" for both
32-bit and 64-bit, and both builds run correctly.

Yu-Jen Chang (2):
lib/string.c: Add a macro for memchr_inv()
lib/string.c: Optimize memchr()

lib/string.c | 62 ++++++++++++++++++++++++++++++++++++++--------------
1 file changed, 45 insertions(+), 17 deletions(-)

--
2.25.1