Re: [PATCH] string: Improve the generic strlcpy() implementation

From: Rasmus Villemoes
Date: Mon Oct 05 2015 - 18:29:01 EST


On Mon, Oct 05 2015, Ingo Molnar <mingo@xxxxxxxxxx> wrote:

> * Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>> So I finally pulled it. I like the patch, I like the new interface,
>> but despite that I wasn't really sure if I wanted to pull it in - thus
>> the long delay of me just seeing this in my list of pending pulls for
>> almost a month, but never really getting to the point where I decided
>> I want to commit to it.
>
> Interesting. I noticed that strscpy() says this in its comments:
>
> * In addition, the implementation is robust to the string changing out
> * from underneath it, unlike the current strlcpy() implementation.
>
> The strscpy() interface is very nice, but shouldn't we also fix this strlcpy()
> unrobustness/race it refers to, in light of the 2000+ existing strlcpy() call
> sites?

How about every single occurence of %s in a format string? vsnprintf
also has that "issue", but has it actually ever been a problem? The
window for something bad to happen is probably also much larger in the
printf case, and especially when when some %p extension is used and/or
the vsnprintf user is kasprintf() (where we 'replay' the formatting,
having hopefully obtained a correct-sized buffer).

In fact, writing this, it occurs to me that we should probably check the
return value of the second vsnprintf call in kasprintf and compare to
the first, issuing a warning if they don't match.

I'm not against making strlcpy more robust, but I think the theoretical
race is far more likely to manifest through a member of the printf
family.

Note that, unless one cares for performance or worries about 2G+
length strings, strlcpy could just be 'return snprintf(dst, len, "%s",
src);', which would give the "check for insanely large/negative len" for
free [though not giving strlen(src) as return value - but the caller is much
more likely to be tripped up by no copying having taken place anyway].

> Another problem is that strlcpy() will also happily do bad stuff if we pass
> it a negative size. Instead of that we will from now on print a (one time)
> warning and return safely.

Well, not too sure about that 'safely'. If the caller somehow managed to
compute an insanely large (remaining) capacity in the buffer and has
that in a size_t variable, then proceeds to comparing the return
value to the supposed buffer size to check for overflow, he will think
that everything is fine and proceed to using likely uninitialized
contents of his buffer.

I think a return value of 0 might be slightly better. Assuming the
caller has the capacity in a signed variable (so it only became huge by
being converted to size_t) and makes a signed comparison with the return
value, both 0 and strlen() triggers an overflow check, so we wouldn't be
worse off in that case. Clearly the same is true if the return value is
not used at all. If the return value is used mindlessly for advancing
dst and decrementing the capacity, staying put is probably better.

Rasmus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/