Re: [PATCH RESEND v11 7/8] open: openat2(2) syscall

From: Aleksa Sarai
Date: Thu Aug 29 2019 - 09:19:42 EST


On 2019-08-29, Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx> wrote:
> On 29/08/2019 14.15, Aleksa Sarai wrote:
> > On 2019-08-24, Daniel Colascione <dancol@xxxxxxxxxx> wrote:
>
> >> Why pad the structure when new functionality (perhaps accommodated via
> >> a larger structure) could be signaled by passing a new flag? Adding
> >> reserved fields to a structure with a size embedded in the ABI makes a
> >> lot of sense --- e.g., pthread_mutex_t can't grow. But this structure
> >> can grow, so the reservation seems needless to me.
> >
> > Quite a few folks have said that ->reserved is either unnecessary or
> > too big. I will be changing this, though I am not clear what the best
> > way of extending the structure is. If anyone has a strong opinion on
> > this (or an alternative to the ones listed below), please chime in. I
> > don't have any really strong attachment to this aspect of the API.
> >
> > There appear to be a few ways we can do it (that all have precedence
> > with other syscalls):
> >
> > 1. Use O_* flags to indicate extensions.
> > 2. A separate "version" field that is incremented when we change.
> > 3. Add a size_t argument to openat2(2).
> > 4. Reserve space (as in this patchset).
> >
> > (My personal preference would be (3), followed closely by (2).)
>
> 3, definitely, and instead of having to invent a new scheme for every
> new syscall, make that the default pattern by providing a helper

Sure (though hopefully I don't need to immediately go and refactor all
the existing size_t syscalls). I will be presenting about this patchset
at the containers microconference at LPC (in a few weeks), so I'll hold
of on any API-related rewrites until after that.

> int __copy_abi_struct(void *kernel, size_t ksize, const void __user
> *user, size_t usize)
> {
> size_t copy = min(ksize, usize);
>
> if (copy_from_user(kernel, user, copy))
> return -EFAULT;
>
> if (usize > ksize) {
> /* maybe a separate "return user_is_zero(user + ksize, usize -
> ksize);" helper */
> char c;
> user += ksize;
> usize -= ksize;
> while (usize--) {
> if (get_user(c, user++))
> return -EFAULT;
> if (c)
> return -EINVAL;

This part would probably be better done with memchr_inv() and
copy_from_user() (and probably should put an upper limit on usize), but
I get what you mean.

> }
> } else if (ksize > usize) {
> memset(kernel + usize, 0, ksize - usize);
> }
> return 0;
> }
> #define copy_abi_struct(kernel, user, usize) \
> __copy_abi_struct(kernel, sizeof(*kernel), user, usize)
>
> > Both (1) and (2) have the problem that the "struct version" is inside
> > the struct so we'd need to copy_from_user() twice. This isn't the end of
> > the world, it just feels a bit less clean than is ideal. (3) fixes that
> > problem, at the cost of making the API slightly more cumbersome to use
> > directly (though again glibc could wrap that away).
>
> I don't see how 3 is cumbersome to use directly. Userspace code does
> struct openat_of_the_day args = {.field1 = x, .field3 = y} and passes
> &args, sizeof(args). What does glibc need to do beyond its usual munging
> of the userspace ABI registers to the syscall ABI registers?

I'd argue that

ret = openat2(AT_FDCWD, "foo", &how, sizeof(how)); // (3)

is slightly less pretty than

ret = openat2(AT_FDCWD, "foo", &how); // (1), (2), (4)

But it's not really that bad. Forget I said anything.

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

Attachment: signature.asc
Description: PGP signature