Re: clone() <-> getpid() bug in 2.6?

From: Russell Leighton
Date: Sun Jun 06 2004 - 09:26:45 EST



I have read the discussion on this issue and I wanted to get confirmation of my understanding...

Issue:
It seems glibc is caching getpid() which is wrong and breaks programs like mine.

You are using an older version of glibc than I and that is why you could run the test program.

Given the above, that means that:
Upgrading my kernel on the FedoraCore2 system won't help because the bug is in glibc

Assuming that this is a "new" feature of glibc, any others upgrading would then starting seeing this bug

Linus Torvalds wrote:

On Sat, 5 Jun 2004, Russell Leighton wrote:


I have a test program (see attached) that shows what looks like a bug in 2.6.5-1.358 (FedoraCore2)...and breaks my program :(

In summary, I am doing:

clone(run_thread, stack + sizeof(stack) -1,
CLONE_FS|CLONE_FILES|CLONE_VM|SIGCHLD, NULL))

According to the man page the child process should have its own pid as returned by getpid()...much like fork().

In 2.6 the child receives the parent's pid from getpid(), while 2.4 works as documented:

In 2.4 the test program does:
parent pid: 26647
clone returned pid: 26648
thread reported pid: 26648

In 2.6 the test program does:
parent pid: 16665
thread reported pid: 16665
clone returned pid: 16666



Hmm.. The above is the correct behaviour if you use CLONE_THREAD ("getpid()" will then return the _thread_ ID), but it shouldn't happen without that. And clearly you don't have it set.

And indeed, it doesn't happen for me on my system:

parent pid: 13552
thread reported pid: 13553
clone returned pid: 13553

so I wonder if either the Fedora libc always adds that CLONE_THREAD thing
to the clone() calls, or whether the FC2 kernel is buggy.

Arjan?

Linus





-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/