Re: Lockless file reading

From: Timo Sirainen
Date: Thu Aug 28 2003 - 08:29:48 EST


On Thursday, Aug 28, 2003, at 15:01 Europe/Helsinki, Jamie Lokier wrote:

This is what happens: the writing CPU writes "1", "2" in order. The
reading CPU reads bytes 1, 3 before the writes are
observed and bytes 0, 2 after. The CPU can do this.

The kernel doesn't prevent this, because it doesn't hold any exclusive
lock between the writer and reader during the data transfers.
Furthermore the kernel transfers a byte at a time, on some
architecture (including x86), if any buffer is not aligned.

Thanks, this was the kind of information I wanted to know. I assumed there were exclusive writer locks and such.

I think I've finally decided what to do. Mostly due to realization that the most problematic part was actually not a problem at all since it required locking anyway. :) Here's the summary and then I'll shut up here:

This was about index files for mailboxes. I had already written support for maildir, but mbox support was where I got stuck. I needed four kind of file updates:

1) File offset pointers (linked list). These are set only once, never modified. Originally I used lowest bit to specify that the offset was "used". I first wrote them without the bit, fsync, then the bit. Apparently this isn't enough then, so I'll change all 4 bytes to contain the used-bit. This gives 28bit number, but I can deal with 8 byte alignments so maximum file size is 2GB which is enough.

2) mbox message offsets. These change when messages are deleted or sometimes with flag changes. But the mbox file has to be locked for reading anyway if I want to use these offsets, so there's no way someone is updating them at the same time.

3) Index summary. Total number of message count, unread message count, etc. There's not many fields here, so I could try different kinds of ideas here.. One that at least works is to keep them in separate file and update it with rename().

4) Flag fields. There aren't mutually exclusive flag changes, so it doesn't matter if I read some of the changes partially.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/