Re: alternatives to null-terminated byte arrays in syscalls in the future?

From: Andrew Kelley
Date: Fri Apr 08 2016 - 17:22:14 EST


On Fri, Apr 8, 2016 at 2:10 PM, Denys Vlasenko <vda.linux@xxxxxxxxxxxxxx> wrote:
> On Fri, Apr 8, 2016 at 11:04 PM, Andrew Kelley <superjoe30@xxxxxxxxx> wrote:
>> The open syscall looks like this:
>>
>> SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode)
>>
>> filename is a null terminated byte array. Null termination is one way
>> to handle lengths of byte arrays, but arguably a better way is to keep
>> track of the length in a separate field. Many programming languages
>> use pointer + length instead of null termination for various reasons.
>>
>> When it's time to make a syscall such as open, software whose byte
>> arrays do not end with a null character is forced to allocate
>> memory, do a memcpy, insert a null byte, perform the open syscall,
>> then deallocate the memory.
>
> In many cases, it's possible to just add the NUL byte instead.

Counterexample, from the Rust standard library:
https://github.com/rust-lang/rust/blob/7e996943784dcbabed433b6906510298ad80903b/src/libstd/sys/unix/fs.rs#L420-L423
https://github.com/rust-lang/rust/blob/7e996943784dcbabed433b6906510298ad80903b/src/libstd/sys/unix/fs.rs#L534-L536

The problem is that the open syscall sits at a low level in any given
application, so it is usually wrapped in an abstraction that cannot
guarantee there is room to append the NUL byte. Implementations
therefore have to take the safe bet of copying the memory.
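
To make the cost concrete, here is a minimal sketch in C of what such a
wrapper ends up doing when all it is handed is a pointer and a length.
The name open_with_len is hypothetical (it is not an existing libc or
kernel interface), and rejecting embedded NUL bytes with EINVAL is my
assumption about what a careful wrapper would do, not something taken
from the Rust code linked above.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a path given as pointer + length.
 * The caller's buffer is not assumed to be NUL-terminated, and we are
 * not allowed to write past its end, so a temporary copy is required.
 */
int open_with_len(const char *path, size_t len, int flags, mode_t mode)
{
    /* An embedded NUL would silently truncate the path; reject it
     * (assumed policy for this sketch). */
    if (memchr(path, '\0', len) != NULL) {
        errno = EINVAL;
        return -1;
    }

    /* Allocate room for the path plus the terminator. */
    char *tmp = malloc(len + 1);
    if (tmp == NULL) {
        errno = ENOMEM;
        return -1;
    }

    /* Copy the bytes and insert the NUL byte. */
    memcpy(tmp, path, len);
    tmp[len] = '\0';

    /* Perform the actual open syscall via the libc wrapper. */
    int fd = open(tmp, flags, mode);

    /* Deallocate, preserving errno from open() across free(). */
    int saved_errno = errno;
    free(tmp);
    errno = saved_errno;

    return fd;
}
```

Every call pays for a malloc, a memcpy and a free purely to satisfy the
NUL-terminated calling convention.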

>
>> What are the chances that in the future, Linux will have alternate
>> syscalls which accept byte array parameters where one can pass the
>> length of the byte array explicitly instead of using a null byte?
>
> 0% chances. Amount of PITA to make that happen far outweighs
> possible benefits.

OK, fair enough. If I proposed a patch to the mailing list, would that
change the chances at all?