deprecated.rst: deprecated strcpy ? (was: [PATCH] checkpatch: add a new check for strcpy/strlcpy uses)

From: Joe Perches
Date: Tue Jan 05 2021 - 04:29:20 EST


On Tue, 2021-01-05 at 14:29 +0530, Dwaipayan Ray wrote:
> On Tue, Jan 5, 2021 at 2:14 PM Joe Perches <joe@xxxxxxxxxxx> wrote:
> >
> > On Tue, 2021-01-05 at 13:53 +0530, Dwaipayan Ray wrote:
> > > strcpy() performs no bounds checking on the destination buffer.
> > > This could result in linear overflows beyond the end of the buffer.
> > >
> > > strlcpy() reads the entire source buffer first. This read
> > > may exceed the destination size limit. This can be both inefficient
> > > and lead to linear read overflows.
> > >
> > > The safe replacement to both of these is to use strscpy() instead.
> > > Add a new checkpatch warning which alerts the user on finding usage of
> > > strcpy() or strlcpy().
> >
> > I do not believe that strscpy is preferred over strcpy.
> >
> > When the size of the output buffer is known to be larger
> > than the input, strcpy is faster.
> >
> > There are about 2k uses of strcpy.
> > Is there a use where strcpy use actually matters?
> > I don't know offhand...
> >
> > But I believe compilers do not optimize away the uses of strscpy
> > to a simple memcpy like they do for strcpy with a const from
> >
> >         strcpy(foo, "bar");
> >
>
> Yes the optimization here definitely helps. So in case the programmer
> knows that the destination buffer is always larger, then strcpy() should be
> preferred? I think the documentation might have been too strict about
> strcpy() uses here:
>
> Documentation/process/deprecated.rst:
> "strcpy() performs no bounds checking on the destination buffer. This
> could result in linear overflows beyond the end of the buffer, leading to
> all kinds of misbehaviors. While `CONFIG_FORTIFY_SOURCE=y` and various
> compiler flags help reduce the risk of using this function, there is
> no good reason to add new uses of this function. The safe replacement
> is strscpy(),..."

Kees/Jonathan:

Perhaps this text is overly restrictive.

There are ~2k uses of strcpy in the kernel.

About half of these are where the buffer length of foo is known and the
use is 'strcpy(foo, "bar")' so the compiler converts/optimizes away the
strcpy to memcpy and may not even put "bar" into the string table.

I believe strscpy uses do not have this optimization.

Is there a case where the runtime costs actually matters?
I expect so.