Re: [PATCH] mm: PageLRU can be non-atomic bit operation
From: Andi Kleen
Date: Tue Apr 24 2007 - 10:23:22 EST
> Why would you need any kind of lock when just changing a single bit,
> if it didn't affect other bits of the same word? Just as you don't
> need a lock when simply assigning a word, setting a bit to 0 or 1
> is simple in itself (it doesn't matter if it was 0 or 1 before).
>
> > But, I think that concurrent bit operation on different bits
> > is just like OR operation , so lock prefix is not needed.
>
> I firmly believe that it is; but I'm not a processor expert.
Think of the CPU cache like the page cache. The VFS cannot change
anything directly on disk - it always has to read a page (or block);
change it there even if it's only a single bit and back.
Now imagine multiple independent kernels to that disk doing this in parallel
without any locking. You will lose data on simple changes because
of data races.
The CPU is similar. The memory is the disk; the protocol talking to
the memory only knows cache lines. There are multiple CPUs talking
to that memory. The CPUs cannot change anything
without reading a full cache line first and then later writing it back.
There is "just do OR operation" when talking to memory, just like
disks don't have such a operation; just read and write.
[there are some special cases with uncached mappings, but they are not
relevant here]
With lock the CPU ensures the read-modify-write cycle is atomic,
without it doesn't.
The CPU also guarantees you that multiple writes don't get lost
even when they hit the same cache line (otherwise you would need lock
for anything in the same cache line), but it doesn't guarantee that
for a word. The details for that vary by architecture; on x86 it is
any memory access, on others it can be only guaranteed for long stores.
The exact rules for all this can be quite complicated (and also vary
by architecture), this is a simplification but should be enough as a rough
guide.
Hope this helps,
-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/