Re: [PATCH] string: Improve the generic strlcpy() implementation

From: Ingo Molnar
Date: Thu Oct 08 2015 - 04:48:50 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> So I really refuse to worry about the snprintf() family of functions wrt this
> race. I don't think it was hugely important for strlcpy() either - more of a
> "quality of implementation" issue rather than anything fundamental - but for
> snprintf and friends it's an almost unavoidable issue because of how snprintf
> works.
>
> Saying that 'strlcpy()' and 'snprintf("%s")' are equivalent is true only in the
> loosest sense. Yes, they return the same return value. Yes, the result string
> should be the same. But the two are completely different despite that.
>
> snprintf() has to handle all the *other* cases than just "%s", including
> right-justification, string precision handling, etc etc. It is effectively
> impossible to do without doing "strlen()" on the source of the string
> beforehand. As a result, snprintf() is fundamentally always going to be racy wrt
> the string changing during the call.
>
> So the simple end result is that we shouldn't worry about it, and if you are
> doing snprintf() on a changing string, you should just be aware of it. We *do*
> actually do that, for things like "current->comm" that really can change while
> being printed out. We just don't care deeply, and have in fact been removing
> locks in this area, because the end result is still guaranteed to be
> NUL-terminated etc.
>
> Can we get odd truncated printouts in the (very very very unlikely) case that
> the string is being changed? Yes. We just don't care.

I mostly agree, but I think we should still try to achieve the following two
properties, if that can be done sanely, cheaply and cleanly:

- the printed string should not contain spurious \0 bytes even if the %s source
  'races'. [I think this is true currently.]

- the return value should correctly represent what snprintf() did to the target
  string. [This might not be the case currently. But I'm not sure!]

Because I think that's a real concern: snprintf()'s return value is frequently
used to iterate over buffers, and it should correctly and reliably represent what
was written, regardless of what the source buffer does. snprintf() obviously knows
what it did to the output buffer: it has full, race-free control over it.
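
To illustrate the iteration pattern I mean, here is a minimal userspace sketch.
The helper name append_fields() and its parameters are made up for this example
and are not taken from any kernel code. It advances through the buffer by
whatever snprintf() returned, so a return value that does not match the bytes
actually written would put 'pos' in the wrong place:

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: append "name=value " pairs to buf, advancing by
 * snprintf()'s return value. Returns the number of bytes stored in buf,
 * excluding the terminating NUL.
 */
static size_t append_fields(char *buf, size_t size,
                            const char *const *names, const int *values,
                            size_t n)
{
    size_t pos = 0;

    for (size_t i = 0; i < n && pos < size; i++) {
        int ret = snprintf(buf + pos, size - pos, "%s=%d ",
                           names[i], values[i]);

        if (ret < 0)
            break;                  /* output error: stop appending */

        if ((size_t)ret >= size - pos) {
            pos = size - 1;         /* truncated: the buffer is now full */
            break;
        }

        pos += (size_t)ret;         /* relies on ret matching what was written */
    }

    return pos;
}
```

Note that the truncation check is needed because snprintf() returns the length
the output would have had, which can exceed the space that was left.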

Whether left-alignment and other formatting details were calculated correctly,
etc. is a secondary concern and cannot be guaranteed, but we should at least
guarantee that we generated a single string, that we did nothing else, and that we
correctly returned its length.
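
As a rough sketch of that minimal guarantee (this is not the kernel's
vsnprintf() code; copy_racy_string() is a hypothetical name, and the volatile
read is only a userspace stand-in for something like READ_ONCE()), a %s-style
copy could read each source byte exactly once and report only what it stored:

```c
#include <stddef.h>

/*
 * Hypothetical sketch: copy a possibly changing string into dst, reading
 * every source byte exactly once. Returns the number of bytes stored in
 * dst, excluding the terminating NUL.
 */
static size_t copy_racy_string(char *dst, size_t size,
                               const volatile char *src)
{
    size_t len = 0;

    if (size == 0)
        return 0;

    while (len < size - 1) {
        char c = src[len];      /* single read of this source byte */

        if (c == '\0')
            break;              /* never copy a NUL into the middle of dst */
        dst[len++] = c;
    }

    dst[len] = '\0';            /* exactly one terminator, at the end */
    return len;                 /* equals strlen(dst) by construction */
}
```

This deliberately returns the stored length rather than the would-be length
that snprintf() reports on truncation, so it only illustrates the "single
string, correct length" property, not the full snprintf() contract.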

Agreed?

Thanks,

Ingo