Re: [PATCH] nextfd(2)

From: KOSAKI Motohiro
Date: Wed Apr 04 2012 - 12:32:28 EST


(4/2/12 4:56 PM), H. Peter Anvin wrote:
On 04/02/2012 04:17 PM, KOSAKI Motohiro wrote:

Sorry for the late comment; I only noticed this thread now. I don't think
the no-/proc-mount case is a good justification for this patch. The real problem
is that we can't use opendir() after fork() if the app is multi-threaded.

SUS clearly says so:
http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html

In a multi-threaded process we can only call async-signal-safe functions after
fork(), and opendir() calls malloc() internally.

As far as I know, OpenJDK has such fork-readdir-exec code, and it can
deadlock when spawning a new process. Unfortunately, Java programs tend to
create many more threads than programs written in other languages.

This patch solves that multi-threaded case.
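
For illustration, the fork-readdir-exec pattern described above looks roughly like the sketch below. The helper name spawn_closing_fds() is hypothetical and the code is not taken from OpenJDK; it only shows where the hazard sits. In a multi-threaded parent, the opendir() call in the child may block forever if another thread held the allocator lock at the moment of fork().

```c
#include <dirent.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper showing the fork-readdir-exec pattern.
 * NOT safe in a multi-threaded parent: opendir() may call malloc(),
 * which is not async-signal-safe.
 */
int spawn_closing_fds(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: close every descriptor above stderr before exec. */
        DIR *d = opendir("/proc/self/fd");    /* may allocate: the hazard */
        if (d) {
            struct dirent *e;
            while ((e = readdir(d)) != NULL) {
                char *end;
                long fd = strtol(e->d_name, &end, 10);
                if (end == e->d_name || *end != '\0')
                    continue;                 /* skips "." and ".." */
                if (fd > 2 && fd != dirfd(d))
                    close((int)fd);
            }
            closedir(d);
        }
        execvp(argv[0], argv);
        _exit(127);                           /* exec failed */
    }

    /* Parent: wait for the child and report its exit status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In a single-threaded caller this works as expected, which is part of why the problem tends to go unnoticed until threads are involved.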

Off topic: glibc malloc is slightly clever. It reinitializes its
internal lock on fork by using a thread_atfork() hook. That means glibc malloc can be
used after fork(), and the technique avoids this issue. But glibc malloc still has several
performance problems, and many people prefer to use jemalloc or Google's malloc instead. Then
they hit the old issue again, bah.
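
The fork-hook technique can be sketched with the public pthread_atfork() interface. This is only an illustration under stated assumptions: pool_lock and the pool_* functions are hypothetical stand-ins for an allocator's internal lock, not glibc's actual code. The prepare hook takes the lock so that no other thread can hold it across fork(), and the child hook gives the child a fresh lock.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock standing in for an allocator's internal mutex. */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the parent before fork(): wait until no other thread holds the lock. */
static void pool_prepare(void) { pthread_mutex_lock(&pool_lock); }

/* Runs in the parent after fork(): release the lock taken in pool_prepare(). */
static void pool_parent(void) { pthread_mutex_unlock(&pool_lock); }

/*
 * Runs in the child after fork(): only the forking thread exists here,
 * so start over with a freshly initialized lock (assumption: this mirrors
 * the "reinitialize" approach described above; unlocking is an alternative).
 */
static void pool_child(void) { pthread_mutex_init(&pool_lock, NULL); }

/* Register the three hooks; returns 0 on success. */
int pool_install_fork_hooks(void)
{
    return pthread_atfork(pool_prepare, pool_parent, pool_child);
}

/* Fork and take the lock in the child; returns 0 if the child got it. */
int pool_fork_and_lock(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int rc = pthread_mutex_lock(&pool_lock);
        _exit(rc == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

An allocator that installs no such hooks leaves the child with whatever lock state existed at fork() time, which is the situation the message describes.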


OK, so what you're saying here is:

Linux doesn't actually have a problem unless:
1. You use the library implementation of opendir/readdir/closedir;
2. You use a nonstandard malloc for the platform which doesn't
correctly set up fork hooks (which I would consider a bug);

Right, but I would dispute the term "correctly set up", because SUS/POSIX doesn't require it.
It is only a workaround in glibc for buggy userland. SUS still says you can't
use opendir(), and typical userland people don't want to ignore SUS if they can avoid it.


You can deal with this in one of two ways:

1. Use the low level open()/getdents()/close() functions instead of
opendir()/readdir()/closedir().
2. Fix your malloc().

Possible in theory, but impossible in practice. 2) People don't write their
own malloc; they just use an open source alternative malloc. And I think
your view is too narrow. Even if the malloc people add a workaround,
the standard still forbids relying on it, and people may keep using the more dangerous
RLIMIT_NOFILE loop. 1) I haven't seen any _practical_ userland software that uses such
Linux-internal hacks. Almost all major software has to run on multiple OSs.


And several times I've received requests that Linux provide fdwalk(). Example,

It doesn't sound very hard to implement fdwalk() in terms of
open/getdents/close without using malloc; since the fdwalk() interface
lets you use the stack for storage. You can then implement closefrom()
in terms of fdwalk(). Something like this (untested):

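#include <fcntl.h>	/* open(), O_* flags */
#include <stddef.h>	/* offsetof() */
#include <unistd.h>	/* close() */

/*
 * struct linux_dirent and getdents() are not declared by glibc; see
 * getdents(2) for the structure layout and the syscall(2) invocation.
 */

/* Call func(cd, fd) for every open descriptor, using only stack storage. */
int fdwalk(int (*func)(void *, int), void *cd)
{
	char buf[4096];	/* ... could be less... */
	const char *p, *q;
	const struct linux_dirent *dp;
	int dfd, fd;
	unsigned char c;
	int rv = 0;
	int sz;

	dfd = open("/proc/self/fd", O_RDONLY|O_DIRECTORY|O_CLOEXEC);
	if (dfd < 0)
		return -1;

	/*** XXX: may want to check for procfs magic here ***/

	while ((sz = getdents(dfd, buf, sizeof buf)) > 0) {
		p = buf;

		while (sz > offsetof(struct linux_dirent, d_name)) {
			dp = (const struct linux_dirent *)p;

			/* Stop on a truncated record */
			if (sz < dp->d_reclen)
				break;

			q = dp->d_name;
			p += dp->d_reclen;
			sz -= dp->d_reclen;

			/* Parse the name as a decimal fd; skip "." and ".." */
			fd = 0;
			while (q < p && (c = *q++)) {
				c -= '0';
				if (c >= 10)
					goto skip;
				fd = fd*10 + c;
			}

			/* Don't report the directory fd itself */
			if (fd != dfd)
				rv = func(cd, fd);
		skip:
			;
		}
	}

	if (close(dfd))
		return -1;

	return rv;
}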

It can be done, but it's uglier, no?
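
To make the closefrom()-on-top-of-fdwalk() idea concrete, here is a self-contained sketch. It is an illustration only, with several assumptions: the names fd_dirent64, my_fdwalk() and my_closefrom() are hypothetical, the raw getdents64 system call is used through syscall(2) instead of getdents(), the record layout is declared locally, and the walk stops at the first nonzero callback return (the untested version quoted above keeps going instead).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Local declaration of the getdents64(2) record layout (assumption). */
struct fd_dirent64 {
    uint64_t       d_ino;
    int64_t        d_off;
    unsigned short d_reclen;
    unsigned char  d_type;
    char           d_name[];
};

/* Call func(cd, fd) for each open descriptor without using malloc(). */
int my_fdwalk(int (*func)(void *, int), void *cd)
{
    char buf[4096];
    long n;
    int rv = 0;

    int dfd = open("/proc/self/fd", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    if (dfd < 0)
        return -1;

    while (rv == 0 && (n = syscall(SYS_getdents64, dfd, buf, sizeof buf)) > 0) {
        for (long off = 0; off < n && rv == 0; ) {
            unsigned short reclen;
            const char *name = buf + off + offsetof(struct fd_dirent64, d_name);

            /* memcpy avoids alignment assumptions about the char buffer. */
            memcpy(&reclen, buf + off + offsetof(struct fd_dirent64, d_reclen),
                   sizeof reclen);
            if (reclen == 0)
                break;                  /* defensive: never loop forever */
            off += reclen;

            /* Parse the entry name as a decimal fd; skip "." and "..". */
            int fd = 0, numeric = (*name != '\0');
            for (const char *q = name; *q; q++) {
                if (*q < '0' || *q > '9') {
                    numeric = 0;
                    break;
                }
                fd = fd * 10 + (*q - '0');
            }

            if (numeric && fd != dfd)
                rv = func(cd, fd);
        }
    }

    close(dfd);
    return rv;
}

/* Callback: close fd if it is at or above the threshold passed via cd. */
static int close_if_at_least(void *cd, int fd)
{
    if (fd >= *(const int *)cd)
        close(fd);
    return 0;
}

/* closefrom() expressed in terms of the walker above. */
void my_closefrom(int lowfd)
{
    my_fdwalk(close_if_at_least, &lowfd);
}
```

A caller in a freshly forked child would typically invoke my_closefrom(3) just before exec. Only open(), the raw system call, memcpy() and close() are used, so no allocator lock is involved.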
