Re: [PATCH] string: Improve the generic strlcpy() implementation

From: Ingo Molnar
Date: Tue Oct 06 2015 - 03:54:27 EST



* Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx> wrote:

> On Mon, Oct 05 2015, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> > * Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> >> So I finally pulled it. I like the patch, I like the new interface,
> >> but despite that I wasn't really sure if I wanted to pull it in - thus
> >> the long delay of me just seeing this in my list of pending pulls for
> >> almost a month, but never really getting to the point where I decided
> >> I want to commit to it.
> >
> > Interesting. I noticed that strscpy() says this in its comments:
> >
> > * In addition, the implementation is robust to the string changing out
> > * from underneath it, unlike the current strlcpy() implementation.
> >
> > The strscpy() interface is very nice, but shouldn't we also fix this strlcpy()
> > unrobustness/race it refers to, in light of the 2000+ existing strlcpy() call
> > sites?
>
> How about every single occurence of %s in a format string? vsnprintf
> also has that "issue", but has it actually ever been a problem? The
> window for something bad to happen is probably also much larger in the
> printf case, and especially when when some %p extension is used and/or
> the vsnprintf user is kasprintf() (where we 'replay' the formatting,
> having hopefully obtained a correct-sized buffer).
>
> In fact, writing this, it occurs to me that we should probably check the
> return value of the second vsnprintf call in kasprintf and compare to
> the first, issuing a warning if they don't match.
>
> I'm not against making strlcpy more robust, but I think the theoretical
> race is far more likely to manifest through a member of the printf
> family.
>
> Note that, unless one cares for performance or worries about 2G+
> length strings, strlcpy could just be 'return snprintf(dst, len, "%s",
> src);', which would give the "check for insanely large/negative len" for
> free [though not giving strlen(src) as return value - but the caller is much
> more likely to be tripped up by no copying having taken place anyway].
>
> > Another problem is that strlcpy() will also happily do bad stuff if we pass
> > it a negative size. Instead of that we will from now on print a (one time)
> > warning and return safely.
>
> Well, not too sure about that 'safely'. If the caller somehow managed to compute
> an insanely large (remaining) capacity in the buffer and has that in a size_t
> variable, then proceeds to comparing the return value to the supposed buffer
> size to check for overflow, he will think that everything is fine and proceed to
> using likely uninitialized contents of his buffer.
>
> I think a return value of 0 might be slightly better. Assuming the caller has
> the capacity in a signed variable (so it only became huge by being converted to
> size_t) and makes a signed comparison with the return value, both 0 and strlen()
> triggers an overflow check, so we wouldn't be worse off in that case. Clearly
> the same is true if the return value is not used at all. If the return value is
> used mindlessly for advancing dst and decrementing the capacity, staying put is
> probably better.

Ok, I can certainly change the return value to 0, but note that the (insane!)
return value of strlcpy() gets used in only about 0.8% of the cases:

triton:~/tip> git grep -w strlcpy | wc -l
2097
triton:~/tip> git grep -w strlcpy | grep -w if | wc -l
11

... so this all is pretty theoretical I think, and we could as well just migrate
all those standalone strlcpy() users that don't check the return code over to
strscpy()!

This would probably speed up all those usecases, so it's a nice optimization.

Then we could convert the remaining ~20 call sites and mark strlcpy() as a working
but deprecated API.

Linus, would you object to such patches, if it's done in a relatively painless
fashion: not propagated into -next but generated automatically late in the merge
window or right after -rc1 or so.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/