Re: [PATCH next] string: Optimise strlen()

From: David Laight

Date: Sat Mar 28 2026 - 17:47:56 EST


On Sat, 28 Mar 2026 12:16:52 -0700
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Sat, 28 Mar 2026 at 04:08, David Laight <david.laight.linux@xxxxxxxxx> wrote:
> >
> > On Fri, 27 Mar 2026 17:29:21 -0700
> > Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > The trivial cases don't even matter, because all the cost of execve()
> > > is elsewhere for those cases.
> > >
> > > But the cases where the strings *do* matter, they are many and long.
> >
> > Is that the strncpy_from_user() path?
>
> No. For annoying reasons, execve() mainly uses "strnlen_user()"
> followed by "copy_from_user()".
>
> See fs/exec.c: copy_strings().
>
> The reason is that it needs to know the size of the string before it
> can start copying it, because the destination address will depend on
> it.
>
> And yes, it's racy, and yes, if you modify the arguments or the
> environment while an execve() is going on, you get what you deserve
> (but it's not a security issue, it's just a "resulting argv[] array is
> odd", but you could have made it odd in the first place, so whatever).
>
> It would be lovely to be able to do it in one go and not walk the
> source string twice, but that's sadly not how the execve() interface
> works (or somebody would need to come up with a clever trick).
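
A rough userspace sketch of the measure-then-copy pattern described above.
This is only an analogue: memchr() stands in for strnlen_user(), memcpy()
for copy_from_user(), and push_string() is a hypothetical helper, not the
code in copy_strings(). Strings are packed downwards from the top of a
buffer, which is why the destination can't be known until the length is.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not kernel code): pack 'src' just below offset
 * '*top' in 'buf'. Returns the new offset of the string, or (size_t)-1
 * if there is no NUL within 'max' bytes or the string does not fit.
 */
static size_t push_string(char *buf, size_t *top, const char *src, size_t max)
{
	/* Pass 1: measure (stand-in for strnlen_user()). */
	const char *nul = memchr(src, '\0', max);
	size_t len;

	if (!nul)
		return (size_t)-1;
	len = (size_t)(nul - src) + 1;	/* include the NUL */
	if (len > *top)
		return (size_t)-1;

	/* Only now is the destination known. */
	*top -= len;

	/* Pass 2: copy (stand-in for copy_from_user()). */
	memcpy(buf + *top, src, len);
	return *top;
}
```

The source is walked twice. In the real user-copy case the string can
change between the two passes, which is the race mentioned above.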

That sounds like a challenge :-)

>
> The main user of strncpy_from_user() is the path copying: see the
> 'getname' variations in fs/namei.c.
>
> And sometimes pathnames are short, but we had a semi-recent discussion
> about the distribution of pathname lengths, prompted by some allocation
> optimizations:
>
> https://lore.kernel.org/all/CAGudoHEMjWCOLEp+TdKLjuguHEKn9+e+aZwfKyK_sYpTZY8HRg@xxxxxxxxxxxxxx/
>
> so while short names are common, longer names aren't *uncommon*, and
> loads that use them tend to keep using them.
>
> We ended up aiming for ~128 bytes for the initial allocation
> (EMBEDDED_NAME_MAX is 168 in one common config) for that reason.
>
> Don't get me wrong: there are certainly many other users of
> strnlen_user() and strncpy_from_user(), but the ones I've seen in any
> half-way normal loads are those two: execve() and pathname copying.
>
> > I started looking at this because someone was trying to write the 'bit-masking'
> > version for (possibly) RISC-V and I decided that they weren't making a good
> > job of it and that it probably wasn't worthwhile (since x86-64 just uses
> > the byte code).
>
> Ok.
>
> I do think that in user space, strlen() and friends can be absolutely
> critical for some loads, because the C string model is horrible.
>
> But in the kernel, I really don't think any of this matters. Our
> strlen() is bad not because it's bad - it's bad because nobody really
> should *care*.

You've said that before - which is why I dissuaded the RISC-V people
from writing a cache-destroying strlen().
Actually strscpy() is also optimised for long strings - that is now
being used all over the place (I think/hope most constant strings get
converted to memcpy()), I suspect the typical length is 10 bytes!
That probably wants de-optimising :-)
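
For reference, the core of the 'bit-masking' approach mentioned above is
the classic zero-byte test on a whole word. The sketch below is portable
C on a 64-bit value only; the names are made up for illustration and it
is not the kernel's word-at-a-time code. It deliberately leaves out the
aligned loads and the setup cost that make this a poor trade for short
strings.

```c
#include <stdint.h>

#define ONES	UINT64_C(0x0101010101010101)
#define HIGHS	UINT64_C(0x8080808080808080)

/*
 * Non-zero iff some byte of 'v' is zero. The lowest set bit marks the
 * least significant zero byte; bits above it can be false positives
 * caused by the borrow in the subtraction.
 */
static inline uint64_t zero_byte_mask(uint64_t v)
{
	return (v - ONES) & ~v & HIGHS;
}

static inline int has_zero_byte(uint64_t v)
{
	return zero_byte_mask(v) != 0;
}

/*
 * Index (0 = least significant) of the first zero byte in 'v',
 * or 8 if there is none.
 */
static inline unsigned int first_zero_byte(uint64_t v)
{
	uint64_t mask = zero_byte_mask(v);
	unsigned int n = 0;

	while (n < 8 && !(mask & 0x80)) {
		mask >>= 8;
		n++;
	}
	return n;
}
```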

The 'one size fits all' for the string functions doesn't help.
If the source contained a constant 'hint' for a typical size then
an appropriate algorithm could be picked.
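
Purely as an illustration of the 'hint' idea (nothing like this exists
in the tree as far as this sketch is concerned): strlen_hint() and
STRLEN_HINT_SHORT are invented names, the cut-over value is arbitrary,
and the library strlen() stands in for whatever long-string routine
would be picked.

```c
#include <stddef.h>
#include <string.h>

#define STRLEN_HINT_SHORT 16	/* arbitrary cut-over, an assumption */

/*
 * 'typical_len' is expected to be a compile-time constant at the call
 * site, so after inlining the compiler can drop the branch not taken.
 */
static inline size_t strlen_hint(const char *s, size_t typical_len)
{
	if (typical_len <= STRLEN_HINT_SHORT) {
		/* Short strings: plain byte loop, no setup cost. */
		const char *p = s;

		while (*p)
			p++;
		return (size_t)(p - s);
	}
	/* Long strings: hand off to the bulk routine. */
	return strlen(s);
}
```

A caller that knows its strings are short would write something like
strlen_hint(name, 10); both paths return the same length, only the cost
profile differs.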

> Some of our "rep scas" users have been kept around exactly because
> absolutely nobody cares, and it's a cute remnant of a very naive young
> Linus who was using them because he was trying to learn things about
> his new i80386 CPU, and started a whole small hobby project as a
> result...

When you wrote those they weren't that bad at all.
The 386 book has 'rep scas' as 5+8n (as does the 286 book) - much faster
than the equivalent instructions.
I am surprised they survived the P4.

In 1990 I was writing driver code for multi-cpu sparc systems (not solaris)
and worrying about TSO and the store buffer.
(and seeing the cpu stall for 150 clocks while the mmu did a page table walk.)

David


>
> Linus