Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion

From: Keith Owens (kaos@ocs.com.au)
Date: Wed Oct 10 2001 - 06:54:39 EST


On Wed, 10 Oct 2001 05:05:10 +0000 (UTC),
torvalds@transmeta.com (Linus Torvalds) wrote:
>Now, before people get all excited, what is this particular code
>actually _good_ for?
>
>Creating a lock-free list that allows insertion concurrently with lookup
>is _easy_.
>
>But what's the point? If you insert stuff, you eventually have to remove
>it. What goes up must come down. Insert-inane-quote-here.
>
>And THAT is the hard part. Doing lookup without locks ends up being
>pretty much worthless, because you need the locks for the removal
>anyway, at which point the whole thing looks pretty moot.

<pedantic>

It is possible to do completely lock free queue management, I have done
it (once). There are several major requirements :-

* Storage must never change its type (type stable storage). Once a
  block has been assigned as struct foo, it must always contain struct
  foo. This can be relaxed slightly as long as the data types have a
  common header format.

  The biggest problem this causes is that storage can never be released
  and unmapped. You never know if somebody is going to follow an old
  pointer to access storage that you freed. You must put the storage
  on a free list for struct foo instead of unmapping it, to maintain
  the type stable storage.

* Every piece of code that traverses the data tree for structure foo
  must take a copy of each struct foo into local storage. After taking
  a copy it must verify that its local copy is self consistent, via
  validity indicators in each struct. The validity indicators are
  typically generation numbers, whatever you use, the indicators must
  be atomically updated and atomically read, with suitable wmb and rmb
  barriers for machines with weak memory ordering.

* After following a pointer in your local copy of struct foo to access
  another data area, you must verify that the master version of your
  local copy has not been changed. If the master copy has changed then
  the pointer you just followed is no longer reliable. In almost every
  case you have to start the traversal from the beginning. It makes
  for very convoluted reader and updater code.

</pedantic>

The only reason I used lock free read and update was to hook into an
existing system that mandated sub second responses. Some of the new
operations that were performed during list traversal and update were
subject to unbounded delays that were completely outside my control. I
could not afford to lock the data structures during traversal because
any delay in the new operations would completely stall the existing
system, also the existing system had to be able to retrieve and delete
data for the new operations at any time.

I don't recommend doing lock free unless you have absolutely no
alternative. It requires convoluted code, it is difficult to debug, it
is expensive on memory bandwidth (forget zero copy) and tends to be
expensive on storage as well.

If you are interested in lock free work, take a look at
http://www.cs.pitt.edu/~moir/papers.html, particularly Practical
Implementations of Non-Blocking Synchronization Primitives. But
beware, that paper has an error which caused intermittent bugs that
took me 5 months to track down. Section 3.3, proc Copy, line 6 should
be

6: y := addr->data[i];

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Oct 15 2001 - 21:00:31 EST