linux interprets an fcntl int arg as long

From: Szabolcs Nagy
Date: Mon Oct 31 2022 - 08:46:45 EST


in short, F_ADD_SEALS fcntl cmd is documented to take int arg,
but linux mm/memfd.c has

	switch (cmd) {
	case F_ADD_SEALS:
		/* disallow upper 32bit */
		if (arg > UINT_MAX)
			return -EINVAL;

fcntl is variadic:

	int fcntl(int fd, int cmd, ... /* arg */);

and arg is either int or pointer in the current documentation.
the libc does not know which (except for existing commands,
but there can be future extensions).

so glibc just assumes a pointer arg and passes it down to the
kernel. musl uses an unsigned long arg. either way, depending
on the vararg abi rules of the target, the top bits of an int
arg can be non-zero when passed to the kernel. (in principle
it could crash too: variadic args are only supposed to work when
the type is right, but in existing abis this does not seem to
be a problem.)
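
to make the libc side concrete, here is a minimal sketch of a
musl-style wrapper. my_fcntl is a made-up name and this is not the
actual glibc or musl source; it only assumes that the wrapper reads
one full word from the varargs and forwards it unchanged:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdarg.h>
#include <sys/syscall.h>
#include <unistd.h>

/* assumption of this sketch: long and pointers are one register wide */
static_assert(sizeof(unsigned long) == sizeof(void *),
	      "sketch assumes long and pointer have the same size");

/* hypothetical wrapper, for illustration only */
static int my_fcntl(int fd, int cmd, ...)
{
	va_list ap;
	unsigned long arg;

	va_start(ap, cmd);
	/*
	 * the wrapper cannot know whether the caller passed an int or a
	 * pointer, so it reads a full word. if the caller passed an int,
	 * the upper bits are whatever the vararg abi left there.
	 */
	arg = va_arg(ap, unsigned long);
	va_end(ap);

	/* forwarded to the kernel without narrowing */
	return (int)syscall(SYS_fcntl, fd, cmd, arg);
}
```

a caller that passes a word-sized arg, e.g. my_fcntl(fd, F_DUPFD, 10UL),
is fine; the problem case is a caller that passes a plain int.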

e.g. the following fails with EINVAL:

	int fd = memfd_create("test", MFD_CLOEXEC|MFD_ALLOW_SEALING);
	int r = fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE | 0xF00000000);

and such an fcntl call can happen with c code that just passes
F_SEAL_WRITE: it is an int, and e.g. with the aarch64 pcs rules
it is passed in a register where the top bits can be non-zero
(unlikely in practice but valid).
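
a self-contained version of the reproducer, as a sketch. it assumes
a 64-bit linux target and a libc that provides memfd_create;
try_add_seals is a made-up helper name. the set upper bits are
passed explicitly as a long to stand in for the garbage an abi may
leave in the register:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* 0xF00000000 does not fit in 32 bits, so this needs a 64-bit long */
static_assert(sizeof(long) == 8, "example assumes a 64-bit long");

/*
 * hypothetical helper: create a sealable memfd and try to add the
 * given seals. returns 0 on success, otherwise the errno that
 * memfd_create or fcntl reported.
 */
static int try_add_seals(long seals)
{
	int fd = memfd_create("test", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	int err;

	if (fd < 0)
		return errno;
	err = fcntl(fd, F_ADD_SEALS, seals) < 0 ? errno : 0;
	close(fd);
	return err;
}
```

try_add_seals(F_SEAL_WRITE) is expected to succeed. on a kernel that
has the check quoted above, try_add_seals(F_SEAL_WRITE | 0xF00000000L)
should return EINVAL, even though the low 32 bits are a valid seal.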

i think there are general issues with variadic syscalls:

1) existing varargs that are documented as int should be
interpreted as int on the kernel side (the libc cannot fix this
up reliably). i.e. linux should fix the F_ADD_SEALS code.

2) variadic args ideally should not have mixed type and the
type that is used for future extensions should be documented.

3) macro defines for varargs that are long should have type
long and not int in the uapi (this affects some prctl flags),
otherwise users must cast manually to the right type.
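
a userspace model of points 1 and 3, as a sketch only. this is not
kernel code: the function and macro names are made up, uint64_t
stands in for a 64-bit register, and narrowing to unsigned int is
just one possible way to do what point 1 asks for:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* the check as quoted from mm/memfd.c, applied to the raw register */
static int64_t seals_checked_as_long(uint64_t reg)
{
	if (reg > UINT_MAX)
		return -EINVAL;
	return (int64_t)reg;
}

/*
 * point 1: narrow to the documented 32-bit type first, so garbage
 * in the upper half of the register is ignored instead of rejected
 */
static int64_t seals_checked_as_int(uint64_t reg)
{
	return (int64_t)(unsigned int)reg;
}

/* point 3: hypothetical flag macros, not real uapi names */
#define EXAMPLE_FLAG_INT  1	/* has type int, caller must cast */
#define EXAMPLE_FLAG_LONG 1UL	/* already has the vararg's type */

static_assert(sizeof(EXAMPLE_FLAG_INT) == sizeof(int),
	      "unsuffixed constant is an int");
static_assert(sizeof(EXAMPLE_FLAG_LONG) == sizeof(unsigned long),
	      "UL constant is an unsigned long");
```

with a register value of 0xF00000008 the first function rejects the
call while the second sees only the low 32 bits, i.e. 8.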

thanks.