Re: [PATCH 1/2] lib: hexdump: use a look-up table to do hex_to_bin

From: Michal Nazarewicz
Date: Thu Jun 30 2016 - 17:18:44 EST


On Thu, Jun 30 2016, Joe Perches wrote:
> On Wed, 2016-06-29 at 21:52 +0300, Andy Shevchenko wrote:
>> On Wed, 2016-06-29 at 20:31 +0200, Michal Nazarewicz wrote:
> []
>> > tolower macro maps to __tolower function which calls isupper to
>> > determine if character is an upper case letter before converting
>> > it to lower case.  This preserves non-letters unchanged which is
>> > what you want in usual case.
>> >
>> > However, hex_to_bin does not care about non-letter characters so
>> > such conversion can be performed as long as (i) upper case letters
>> > become lower case, (ii) lower case letters are unchanged and (iii)
>> > non-letters stay non-letters.
>> >
>> > This is exactly what _tolower function does and using it makes it
>> > possible to avoid _ctype table lookup performed by the isupper
>> > macro.
>> >
>> > Furthermore, since _tolower conversion is done unconditionally, this
>> > also eliminates a single branch.
>> This change I agree with since _tolower() is specific for lib internal
>> usage in the kernel.
>
> Perhaps _tolower should be used a bit more in lib
> ---
>  lib/string.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/lib/string.c b/lib/string.c
> index ed83562..b0e72fd 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -53,8 +53,8 @@ int strncasecmp(const char *s1, const char *s2, size_t len)
>  			break;
>  		if (c1 == c2)
>  			continue;
> -		c1 = tolower(c1);
> -		c2 = tolower(c2);
> +		c1 = _tolower(c1);
> +		c2 = _tolower(c2);

That won’t work: _tolower also alters non-letters (‘@’ becomes ‘`’),
so unrelated characters would compare equal. If someone really wanted,
we probably could get away with:

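bool strneq(const char *s1, const char *s2, size_t len)
{
	/* Yes, Virginia, it had better be unsigned */
	unsigned char c1, c2, x;

	if (!len)
		return true;

	do {
		c1 = *s1++;
		c2 = *s2++;
		/* At the end of either string, equal only if both ended. */
		if (!c1 || !c2)
			return c1 == c2;
		/* x is 0 for identical bytes, 0x20 for a pure case difference. */
		x = c1 ^ c2;
		if (x && (x != 0x20 || !isalpha(c1) ||
			  _tolower(c1) != _tolower(c2)))
			return false;
	} while (--len);
	/* All len characters matched, possibly differing only in case. */
	return true;
}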

I didn’t find any uses of strncasecmp where the result isn’t simply used
as a boolean equal/non-equal test. This is a bigger undertaking though.
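
To illustrate why the plain substitution in the quoted hunk misbehaves,
here is a minimal userspace sketch. It is not kernel code: fold_case,
eq_unconditional, eq_checked and self_check are hypothetical names, and
fold_case only assumes that the kernel's _tolower is "c | 0x20" as shown
in the ctype.h diff in this mail.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's _tolower(): always sets bit 0x20. */
unsigned char fold_case(unsigned char c)
{
	return c | 0x20;
}

/* Equality as the quoted strncasecmp hunk would compute it. */
bool eq_unconditional(unsigned char a, unsigned char b)
{
	return fold_case(a) == fold_case(b);
}

/* Equality using the letter-aware tolower() from <ctype.h>. */
bool eq_checked(unsigned char a, unsigned char b)
{
	return tolower(a) == tolower(b);
}

/* Exercises a few ASCII cases; aborts if an expectation is wrong. */
void self_check(void)
{
	/* Letters behave the same either way. */
	assert(eq_unconditional('A', 'a') && eq_checked('A', 'a'));
	/* '@' (0x40) and '`' (0x60) differ only in bit 0x20. */
	assert(eq_unconditional('@', '`') && !eq_checked('@', '`'));
	/* Likewise '[' (0x5b) and '{' (0x7b). */
	assert(eq_unconditional('[', '{') && !eq_checked('[', '{'));
	/* NUL folds to a space, so a "c1 != 0" loop test would never fire. */
	assert(fold_case('\0') == ' ');
}
```

Calling self_check() from a test program should pass silently. The last
assertion is the reason the strcasecmp hunk looks risky as well: after
folding, the terminating NUL is no longer zero.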

We could try doing this though:

---
 include/linux/ctype.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/ctype.h b/include/linux/ctype.h
index 653589e..b1ef461 100644
--- a/include/linux/ctype.h
+++ b/include/linux/ctype.h
@@ -35,17 +35,19 @@ extern const unsigned char _ctype[];
 #define isascii(c) (((unsigned char)(c))<=0x7f)
 #define toascii(c) (((unsigned char)(c))&0x7f)

+#define _CTYPE_LOWER_BIT 0x20 /* bit determining if letter is lower case */
+
 static inline unsigned char __tolower(unsigned char c)
 {
 	if (isupper(c))
-		c -= 'A'-'a';
+		c |= _CTYPE_LOWER_BIT;
 	return c;
 }

 static inline unsigned char __toupper(unsigned char c)
 {
 	if (islower(c))
-		c -= 'a'-'A';
+		c &= ~_CTYPE_LOWER_BIT;
 	return c;
 }

@@ -58,7 +60,7 @@ static inline unsigned char __toupper(unsigned char c)
  */
 static inline char _tolower(const char c)
 {
-	return c | 0x20;
+	return c | _CTYPE_LOWER_BIT;
 }

 /* Fast check for octal digit */
--
2.8.0.rc3.226.g39d4020

but whether it’s actually faster on modern hardware, I have no idea.
Similarly, a lot of ‘foo - '0'’ could be replaced by ‘foo & 0xf’, but
this again is a bigger undertaking.
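
Both the bit operations in the patch and the ‘foo & 0xf’ idea rely on
the ASCII layout. The following is a minimal userspace sketch that
checks the arithmetic and bitwise forms agree on the inputs where they
are meant to be used. All helper names (lower_sub, lower_or, upper_sub,
upper_and, digit_sub, digit_and, check_forms) are hypothetical and do
not exist in the kernel; LOWER_BIT merely mirrors _CTYPE_LOWER_BIT.

```c
#include <assert.h>

#define LOWER_BIT 0x20	/* same value as _CTYPE_LOWER_BIT in the patch */

/* Old and new bodies of __tolower(), without the isupper() guard. */
unsigned char lower_sub(unsigned char c) { return c - ('A' - 'a'); }
unsigned char lower_or(unsigned char c)  { return c | LOWER_BIT; }

/* Old and new bodies of __toupper(), without the islower() guard. */
unsigned char upper_sub(unsigned char c) { return c - ('a' - 'A'); }
unsigned char upper_and(unsigned char c) { return c & ~LOWER_BIT; }

/* Two ways to turn an ASCII digit into its value. */
int digit_sub(unsigned char c) { return c - '0'; }
int digit_and(unsigned char c) { return c & 0xf; }

/* Aborts if the forms disagree anywhere in their intended input range. */
void check_forms(void)
{
	unsigned char c;

	for (c = 'A'; c <= 'Z'; c++)
		assert(lower_sub(c) == lower_or(c));
	for (c = 'a'; c <= 'z'; c++)
		assert(upper_sub(c) == upper_and(c));
	for (c = '0'; c <= '9'; c++)
		assert(digit_sub(c) == digit_and(c));

	/* Outside the digit range the results diverge. */
	assert(digit_sub('A') == 17);
	assert(digit_and('A') == 1);
}
```

The sketch says nothing about speed. Note also that the masked form
always yields a value between 0 and 15, so it cannot replace call sites
that use the result of the subtraction to reject non-digits.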

>  		if (c1 != c2)
>  			break;
>  	} while (--len);
> @@ -69,8 +69,8 @@ int strcasecmp(const char *s1, const char *s2)
>  	int c1, c2;
>
>  	do {
> -		c1 = tolower(*s1++);
> -		c2 = tolower(*s2++);
> +		c1 = _tolower(*s1++);
> +		c2 = _tolower(*s2++);
>  	} while (c1 == c2 && c1 != 0);
>  	return c1 - c2;
>  }
>
>

--
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»