Re: [PATCHv4 RESEND 0/3] syscalls,x86: Add execveat() system call

From: Al Viro
Date: Sun Oct 19 2014 - 18:43:09 EST


On Sun, Oct 19, 2014 at 03:16:03PM -0700, Andy Lutomirski wrote:

> Oh, you mean that #!/usr/bin/make -f would turn into /usr/bin/make
> /dev/fd/3? That could be interesting, although I can imagine it
> breaking things, especially if /dev/fd/3 isn't set up like that, e.g.
> early in boot.

Sigh... What I mean is that fexecve(fd, ...) would have to put _something_
into argv when it execs the interpreter of #! file. Simply because the
interpreter (which can be anything whatsoever) has no fscking idea what
to do with some descriptor it has before execve(). Hell, it doesn't have
any idea *which* descriptor it had been.

You need to put some pathname that would yield your script upon open(2).
If you bothered to read those patches, you'd see that they do supply
one, generating it with d_path(). Which isn't particularly reliable.

I'm not sure there's any point putting any of that in the kernel - if
you *do* have that pathname, you can just pass it.
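
To make the argv problem concrete, here is a rough userland sketch (Python,
not kernel code) of how a #! line ends up as the interpreter's argv. The
function names are made up for illustration, and the /dev/fd/N form is the
one from the quoted suggestion, not what the patches generate. It assumes
the Linux behaviour of passing everything after the interpreter path as a
single optional argument.

```python
def fd_pathname(fd):
    """Hypothetical helper: a pathname that might reopen descriptor fd."""
    return "/dev/fd/%d" % fd


def interp_argv(shebang_line, script_path, script_args):
    """Build the argv an interpreter would see for a #! script.

    Assumed layout: interpreter, at most one optional argument (the rest
    of the #! line), the script pathname, then the caller's arguments.
    """
    if not shebang_line.startswith("#!"):
        raise ValueError("not a #! line")
    parts = shebang_line[2:].strip().split(None, 1)
    if not parts:
        raise ValueError("no interpreter given")
    argv = [parts[0]]
    if len(parts) == 2:
        # Everything after the interpreter is passed as ONE argument.
        argv.append(parts[1].strip())
    # The interpreter only ever sees a pathname here, never a descriptor.
    argv.append(script_path)
    argv.extend(script_args)
    return argv
```

With the quoted example, interp_argv("#!/usr/bin/make -f", fd_pathname(3), [])
gives ['/usr/bin/make', '-f', '/dev/fd/3']. The last element is the only way
the interpreter learns where the script is, so it has to be something that
open(2) can resolve.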

> Aside from the general scariness of allowing one process to actually
> dup another process's fds, I feel like this is asking for trouble wrt
> the various types of file locks.

Who said anything about another process's fds? That, indeed, would be
a recipe for serious trouble. It's a filesystem with one directory,
not with one directory for each process...

FWIW, they (Plan 9) do have procfs and there they have /proc/<pid>/fd.
Which is a regular file, with contents consisting of \n-terminated
lines (one per descriptor in <pid>'s descriptor table) in the same
format as in *ctl (they put descriptor number as the first field in
those).

Unlike our solution, they do not allow access to any process' files via
procfs. They do allow /dev/stdin-style access to your own files via
dupfs. And yes, for /dev/stdin and friends dup-style semantics are better -
you get consistent behaviour for pipes and redirects from a file that way.
See the example I've posted upthread. Besides, for things like sockets
our semantics simply fails - they really depend on having only one
struct file for given socket; it's dup or nothing there. The same goes
for a lot of things like eventfd, etc.
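
A small userland sketch (Python) of the difference, for anyone who wants to
poke at it. The function names are made up for illustration, and the
reopen results assume a Linux-style /dev/fd (a symlink to /proc/self/fd)
where open() creates a new struct file. On systems where /dev/fd has dup
semantics the two cases should behave the same.

```python
import os
import socket
import tempfile


def offset_seen_by_original(reopen):
    """Read 4 bytes through a second descriptor, then report the file
    offset of the original descriptor."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.lseek(fd, 0, os.SEEK_SET)
        if reopen:
            # open(2) on /dev/fd/N: on Linux, a fresh struct file.
            fd2 = os.open("/dev/fd/%d" % fd, os.O_RDONLY)
        else:
            # dup(2): same struct file, so the offset is shared.
            fd2 = os.dup(fd)
        try:
            os.read(fd2, 4)
        finally:
            os.close(fd2)
        return os.lseek(fd, 0, os.SEEK_CUR)
    finally:
        os.close(fd)
        os.unlink(path)


def reopen_socket_error():
    """Try to open /dev/fd/N for a socket; return errno, or None if the
    open succeeded."""
    a, b = socket.socketpair()
    try:
        try:
            fd2 = os.open("/dev/fd/%d" % a.fileno(), os.O_RDWR)
        except OSError as e:
            return e.errno
        os.close(fd2)
        return None
    finally:
        a.close()
        b.close()
```

With dup() the original descriptor's offset moves along with the reads done
through the copy. With the Linux reopen it should stay where it was, which
is the inconsistency between pipes and redirects from a file. For the
socket, the reopen is expected to fail outright on Linux (typically with
ENXIO), since there is no way to get a second struct file for it.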