On Tue, 9 Dec 2008, Casey Dahlin wrote:
This is essentially my first kernel patch, so be nice :)
Linux already has signalfd, timerfd, and eventfd to expose signals, timers,
and events via a file descriptor. This patch is a working prototype for a
fourth: waitfd. It pretty much does what the name suggests: reading from it
yields a series of status ints (as would be written into the second argument
of waitpid) for child processes that have changed state. It takes essentially
the same arguments as waitpid (for now) and supports the same set of features.
This is far from ready for merge. Some things that are wrong with it:
* Waitpid's argument scheme probably isn't the best for this. By default it
makes it easiest to wait on a single child, which is not often useful in this
case. Waiting on all children or children in a particular process group is
possible, but not a particular, explicit set of children, which we probably
want (and which will complicate the implementation significantly).
* The prototype for peek_waitpid is obviously in the wrong place, but I
haven't found a good home for it.
* Waitid's semantics have slightly altered: passing NULL as the pointer to
siginfo_t with WNOWAIT will now return successfully instead of throwing
EFAULT. I don't know if that means I broke it or fixed it :)
* peek_waitpid may not be required at all now. I can probably trick sys_wait4
or sys_waitid into giving me what I want (or I could always just make do_wait
non-static).
Please provide thoughts.
What's wrong in having a signalfd on SIGCHLD, than doing waitpid() once you get the signal? Do you have cases where this wouldn't be OK?
About the code, eventually, you really want to report the PID of the exited child, together with the status. So maybe a non-compat-requiring struct would be better to be returned by read(2).
Also ...
+static unsigned int waitfd_poll(struct file *file, poll_table *wait)
+{
+ struct waitfd_ctx *ctx = file->private_data;
+ long value;
+
+ poll_wait(file, ¤t->signal->wait_chldexit, wait);
+
+ value = peek_waitpid(ctx->upid, ctx->ops);
+ if (value > 0) {
+ return POLLIN;
+ } if (value == -ECHILD) {
+ return POLLIN;
+ }
Trust the compiler, it's pretty good in not having you to add Perl-like extra brackets ;)
This also looks wierd:
} if (value == -ECHILD) {
So maybe:
return value > 0 || value == -ECHILD ? POLLIN: 0;
+/*
+ * Returns a multiple of the size of a "struct waitfd_siginfo", or a negative
+ * error code. The "count" parameter must be at least the size of a
+ * "struct waitfd_siginfo".
+ */
Really? ...
More leftovers from above.+static ssize_t waitfd_read(struct file *file, char __user *buf, size_t count,
+ loff_t *ppos)
+{
+ struct waitfd_ctx *ctx = file->private_data;
+ int __user *stat_addr = (int *)buf;
+ int nonblock = file->f_flags & O_NONBLOCK ? WNOHANG: 0;
+ ssize_t ret, total = 0;
+
+ count /= sizeof(int);
+ if (!count)
+ return -EINVAL;
+
+ do {
+ ret = sys_wait4(ctx->upid, stat_addr, ctx->ops | nonblock,
+ NULL);
+ if (ret == 0)
+ ret = -EAGAIN;
+ if (ret == -ECHILD)
+ ret = 0;
+ if (ret <= 0)
+ break;
+
+ stat_addr++;
+ total += sizeof(struct siginfo);
+ nonblock = WNOHANG;
+ } while (--count);
+
+ return total ? total: ret;
+}
... looks like you're returning a sequence of status ints, with wrong data size returned by read(2).
ack'd.
+
+static const struct file_operations waitfd_fops = {
+ .release = waitfd_release,
+ .poll = waitfd_poll,
+ .read = waitfd_read,
+};
+
+asmlinkage long sys_waitfd(pid_t upid, int ops)
Please leave space for extra flags for fds, otherwise Uli will have to make another sys_waitfd3().
+long peek_waitpid(pid_t upid, int options)
+{
+ struct pid *pid = NULL;
+ enum pid_type type;
+ long ret;
+
+ if (options & ~(WNOHANG|WUNTRACED|WCONTINUED|
+ __WNOTHREAD|__WCLONE|__WALL))
+ return -EINVAL;
+
+ options |= WNOHANG | WNOWAIT;
+
+ if (upid == -1)
+ type = PIDTYPE_MAX;
+ else if (upid < 0) {
+ type = PIDTYPE_PGID;
+ pid = find_get_pid(-upid);
+ } else if (upid == 0) {
+ type = PIDTYPE_PGID;
+ pid = get_pid(task_pgrp(current));
+ } else /* upid > 0 */ {
+ type = PIDTYPE_PID;
+ pid = find_get_pid(upid);
+ }
+
+ ret = do_wait(type, pid, options | WEXITED, NULL, NULL, NULL);
+ put_pid(pid);
+
+ /* avoid REGPARM breakage on x86: */
+ asmlinkage_protect(4, ret, upid, options);
+ return ret;
+}
Given that you blinded copied this from sys_wait4(), you may want to at least try to make sys_wait4() to re-use the new function.
Also, your patch does not apply to latest Linus tree. Which one was the base?This is the last commit before mine in my git repo: