Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x

From: Samuel Thibault
Date: Thu Apr 27 2006 - 05:09:17 EST


Hi,

I'm joining this discussion a bit late, but this is a bug that we have
known about here for quite some time. The easy fix to get correct
concurrent writes was to add a pipe: instead of calling

my_parallel_program > log

just call

my_parallel_program | tee log

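The pipe guarantees atomicity: POSIX requires a write of at most
PIPE_BUF bytes to a pipe not to be interleaved with other writes.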

Another testcase was the following program:

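#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    int n, num = 0, lus = 0;    /* lus: number of bytes read so far */
    char c;

    /* Optional argument: total number of processes (parent included). */
    if (argc >= 2)
        num = atoi(argv[1]) - 1;

    /* Fork num children; each child leaves the loop with its own num.
     * Testing num > 0 avoids runaway forking on a bad argument. */
    while (num > 0) {
        if (fork() == 0)
            break;
        num--;
    }

    /* Every process reads stdin byte by byte through the shared offset. */
    for (;;) {
        n = read(STDIN_FILENO, &c, 1);
        if (n == 1)
            lus++;
        else
            break;
    }

    printf("Proc %d read %d bytes\n", num, lus);
    return 0;
}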

Running it with no parameter runs fine:

$ ls -l foo
-rwxr-xr-x 1 samy samy 8174 2006-04-07 11:21 foo
$ ./lire < foo
Proc 0 read 8174 bytes

Running it with several processes shows that the file offset update is
not atomic, even on a uniprocessor (UP) machine:

$ ./lire 2 < foo
Proc 0 read 3903 bytes
Proc 1 read 8174 bytes

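The two counts add up to 12077, more than the 8174 bytes in the file,
so some bytes were read by both processes. And it gets worse as the
number of processes increases. When we discovered this, we checked such
atomicity (i.e. each byte of the file is read by exactly one process)
on several systems: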

- As shown, Linux completely fails.
- OS X almost succeeds: on a dual-processor SMP system, with 10
processes on a 900KB file, 68 bytes got read by more than one process.
- OSF succeeds: on an 8-way SMP system, with 10 processes on a 50MB
file, _no_ byte got read by more than one process.
- Solaris completely fails.
- AIX completely fails.

So I guess that people should just _not_ rely on such atomicity.

Regards,
Samuel