Re: SMP read() stopping at memory page boundaries

From: Timo Sirainen
Date: Wed Jun 20 2007 - 11:48:23 EST


On Wed, 2007-06-20 at 17:52 +0300, Timo Sirainen wrote:
> Sometimes read() returns only 4096 bytes. I'm locking the file, so I
> don't think this should ever happen, right?

Sorry, the problem was with file truncating so there's no bug with
locking. I didn't notice it first because it happened to work with
FreeBSD and Solaris.

Without locking there's a problem that I'd want to avoid, but I guess
it's not necessarily a bug (although it doesn't seem to happen with
FreeBSD or Solaris):

Process 1:

- create "foo2"
- rename() to "foo"
- lock
- write(4096 + 16 bytes)
- [fsync() here doesn't change anything]
- pwrite("1111" to offset=4096-4)
- unlock

Process 2:

- open("foo")
- read(8192 bytes)

read() sometimes returns 4096 bytes but with the "1111" already included
in the data. Is there a way to avoid this without locking the file while
reading? The "1111" tries to act as a kind of a lock.

Again attached a test program, which should really do what I
intended. :)
/*
gcc concurrency.c -o concurrency -Wall

start two both a reader and a writer:

./concurrency
./concurrency 1
*/
#define _XOPEN_SOURCE 500
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <assert.h>
#include <sys/file.h>

#define MAX_PAGESIZE 8192

int main(int argc, char *argv[])
{
char buf[MAX_PAGESIZE*2], ones[4] = { 1, 1, 1, 1 };
int fd, ret, pagesize;

memset(buf, 0, sizeof(buf));

pagesize = getpagesize();
assert(pagesize <= MAX_PAGESIZE);

buf[pagesize] = 'h';
if (argc == 1) {
printf("writing, page size = %d\n", pagesize);
for (;;) {
fd = open("foo2", O_RDWR | O_CREAT | O_TRUNC, 0600);
if (fd == -1) {
perror("open()");
return 1;
}
if (rename("foo2", "foo") < 0)
perror("rename()");
usleep(rand() % 1000);

if (flock(fd, LOCK_EX) < 0)
perror("flock()");
pwrite(fd, buf, pagesize+16, 0);
//usleep(rand() % 1000);
//fdatasync(fd);
pwrite(fd, ones, 4, pagesize-4);
if (flock(fd, LOCK_UN) < 0)
perror("flock()");
usleep(rand() % 1000);
close(fd);
}
} else {
printf("reading, page size = %d\n", pagesize);
for (;; close(fd)) {
fd = open("foo", O_RDWR, 0600);
if (fd == -1) {
perror("open()");
return 1;
}

ret = pread(fd, buf, sizeof(buf), 0);
if (ret < pagesize) {
if (ret > 0)
printf("less than a page: %d\n", ret);
continue;
}

if (memcmp(buf + pagesize - 4, ones, 4) == 0) {
if (ret == pagesize) {
printf("page size cut\n");
} else if (buf[pagesize] != 'h') {
printf("broken data\n");
}
}
}
}
return 0;
}

Attachment: signature.asc
Description: This is a digitally signed message part