[PATCH v2 0/2] Clean up and shrink uuid input & output

From: George Spelvin
Date: Sat Jun 04 2016 - 01:13:15 EST

Okay, here's a more formal submission.

Net code size changes for both patches:

uuid_string() __uuid_to_bin()
Before After Delta Percent Before After Delta Percent
x86-32 245 196 -49 -20.0% 122 90 -32 -26.2%
x86-64 246 162 -84 -34.1% 127 90 -37 -29.1%
arm 292 216 -76 -26.0% 116 92 -24 -20.7%
thumb 220 136 -84 -38.2% 100 50 -50 -50.0%
arm64 296 188 -108 -36.5% 148 116 -32 -21.6%

I haven't measured the speedup because, although it's obviously faster,
speed is not particularly important. (I wouldn't be using out-of-line
hex2bin() if it were.)

Andy Shevchenko wrote
> Be honest, you increase rodata for almost the same amount of bytes.

No, my original patch didn't increase rodata at all, because it replaced
tables of the same size.

Bringing that patch forward in the simplest way (the first copy, which
I sent you privately) broke the sharing of the arrays with lib/uuid.c
and thereby increased rodata by 32 bytes.

But patching lib/uuid.c to share the new tables as is done in this
patch series ends up with a net 16-byte reduction in rodata.

> Also, please fix Cc list (I got bounce response) and add some key people
> like Rasmus.

Mea culpa. I manually copied the e-mail addresses and managed to typo
"bjorn" into "bbjorn".

>> The arrays are combined in one contiguous uuid_byte_pos[2][16]
>> array as a micro-optimization for uuid_string().

> Oh, it makes readability worse.

It's truly marginal, deleted.

> How hex2bin is better here? We have validation done, no need to repeat.
> So, I suggest not to touch this piece of code.

hex2bin is better than hex_to_bin because it's does more of what we want
in one call rather than needing two.

As for checking the return value, it's a longish story.
However, at your prompting, I've deleted it (more size savings!).

Now, the story...

Originally, I had eliminated the uuid_is_valid() call entirely, and
just checked that the hyphens were in the right place and the digits
all converted:

static int __uuid_to_bin(const char *uuid, __u8 b[16], const u8 si[16])
unsigned int i;

if (uuid[ 8] != '-' || uuid[13] != '-' ||
uuid[18] != '-' || uuid[23] != '-')
return -EINVAL;

for (i = 0; i < 16; i++)
if (hex2bin(b + i, uuid + si[i], 1) < 0)
return -EINVAL;

return 0;

... but I was worried that some callers might be passing in an arbitrary
null-terminated string, and reading uuid[23] before checking that the
preceding characters were all non-null might be a bad thing.

So I put the uuid_is_valid() call back.

However, I left the second return value check, on general principles,
because it was cheap and I wasn't sure it wasn't useful.

i was thinking about about time-of-check to time-of-use race conditions.
I had just determined that I didn't know the callers and what they were
doing, so I wasn't sure the source uuid would be stable for the duration
of the call.

However, this isn't a real TOCTTOU security bug. The only thing that
corrupting the buffer can do is insert a 0xff byte in the uuid. Which
is nothing that couldn't be done with a static input.

So it's unnecessary.

George Spelvin (2):
lib/vsprintf.c: Simplify uuid_string()
lib/uuid.c: eliminate uuid_[bl]e_index arrays

include/linux/uuid.h | 6 ++++--
lib/uuid.c | 24 ++++++++++--------------
lib/vsprintf.c | 40 +++++++++++++++++-----------------------
3 files changed, 31 insertions(+), 39 deletions(-)