[PATCH] refcount_t: documentation for memory ordering differences

From: Elena Reshetova
Date: Mon Nov 06 2017 - 08:38:44 EST


Some functions in the refcount_t API provide different
memory ordering guarantees than their atomic counterparts.
This adds a document outlining the differences and
showing examples.

Signed-off-by: Elena Reshetova <elena.reshetova@xxxxxxxxx>
---
Documentation/refcount-vs-atomic.txt | 234 +++++++++++++++++++++++++++++++++++
1 file changed, 234 insertions(+)
create mode 100644 Documentation/refcount-vs-atomic.txt
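
For anyone who wants to experiment with the ordering difference outside
the kernel, below is a minimal userspace sketch. It is an analogy built
on C11 <stdatomic.h>, not the lib/refcount.c implementation: struct obj,
obj_get_not_zero() and obj_put() are made-up names. The get side is fully
relaxed, mirroring refcount_inc_not_zero() in case 4); the put side uses
a RELEASE decrement, mirroring refcount_dec_and_test() in case 5). The
acquire fence before the final "true" is the usual C11 idiom and is an
assumption of this sketch; the document itself only promises RELEASE plus
a control dependency for later stores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object with an embedded reference counter. */
struct obj {
    atomic_int refs;
    int payload;
};

/*
 * Userspace analogue of refcount_inc_not_zero(): take a reference only
 * if the counter is not already zero. All accesses are relaxed, so no
 * ordering against surrounding loads and stores is implied.
 */
static bool obj_get_not_zero(struct obj *o)
{
    int old = atomic_load_explicit(&o->refs, memory_order_relaxed);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak_explicit(&o->refs, &old, old + 1,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
            return true;
    }
    return false;
}

/*
 * Userspace analogue of refcount_dec_and_test(): the decrement is a
 * RELEASE operation, so earlier loads and stores of this thread are
 * ordered before it. Returns true on the 1 --> 0 transition.
 */
static bool obj_put(struct obj *o)
{
    int old = atomic_fetch_sub_explicit(&o->refs, 1, memory_order_release);

    /* The kernel would warn on underflow; this sketch just asserts. */
    assert(old > 0);
    if (old != 1)
        return false;

    /* Assumption of this sketch (C11 idiom), not taken from the patch. */
    atomic_thread_fence(memory_order_acquire);
    return true;
}
```

A caller would free the object only when obj_put() returns true, and
would treat a false return from obj_get_not_zero() as "object is already
on its way out".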

diff --git a/Documentation/refcount-vs-atomic.txt b/Documentation/refcount-vs-atomic.txt
new file mode 100644
index 0000000..09efd2b
--- /dev/null
+++ b/Documentation/refcount-vs-atomic.txt
@@ -0,0 +1,234 @@
+===================================
+refcount_t API compared to atomic_t
+===================================
+
+The goal of the refcount_t API is to provide a minimal API for implementing
+an object's reference counters. While a generic architecture-independent
+implementation from lib/refcount.c uses atomic operations underneath,
+there are a number of differences between some of the refcount_*() and
+atomic_*() functions with regard to the memory ordering guarantees.
+This document outlines the differences and provides respective examples
+in order to help maintainers validate their code against the change in
+these memory ordering guarantees.
+
+memory-barriers.txt and atomic_t.txt provide more background on
+memory ordering in general and for atomic operations specifically.
+
+Summary of the differences
+==========================
+
+ 1) There is no difference between respective non-RMW ops, i.e.
+    refcount_set() & refcount_read() have exactly the same ordering
+    guarantees (meaning fully unordered) as atomic_set() and atomic_read().
+ 2) For the increment-based ops that return no value (namely
+    refcount_inc() & refcount_add()) memory ordering guarantees are
+    exactly the same (meaning fully unordered) as the respective atomic
+    functions (atomic_inc() & atomic_add()).
+ 3) For the decrement-based ops that return no value (namely
+    refcount_dec()) memory ordering guarantees are slightly
+    stronger than the respective atomic counterpart (atomic_dec()).
+    While atomic_dec() is fully unordered, refcount_dec() does
+    provide a RELEASE memory ordering guarantee (see next section).
+ 4) For the rest of the increment-based RMW ops (refcount_inc_not_zero(),
+    refcount_add_not_zero()) the memory ordering guarantees are relaxed
+    compared to their atomic counterparts (atomic_inc_not_zero()).
+    Refcount variants provide no memory ordering guarantees apart from
+    control dependency on success, while atomics provide full memory
+    ordering guarantees (see next section).
+ 5) The rest of the decrement-based RMW ops (refcount_dec_and_test(),
+    refcount_sub_and_test(), refcount_dec_if_one(), refcount_dec_not_one())
+    provide only RELEASE memory ordering and control dependency on success
+    (see next section). The respective atomic counterparts
+    (atomic_dec_and_test(), atomic_sub_and_test()) provide full memory ordering.
+ 6) The lock-based RMW ops (refcount_dec_and_lock() &
+    refcount_dec_and_mutex_lock()) always provide RELEASE memory ordering,
+    plus ACQUIRE memory ordering & control dependency on success
+    (see next section). The respective atomic counterparts
+    (atomic_dec_and_lock() & atomic_dec_and_mutex_lock())
+    provide full memory ordering.
+
+
+
+Details and examples
+====================
+
+Here we consider cases 3)-6), which do present differences, together
+with respective examples.
+
+case 3) - decrement-based RMW ops that return no value
+------------------------------------------------------
+
+Function changes:
+ atomic_dec() --> refcount_dec()
+
+Memory ordering guarantee changes:
+ fully unordered --> RELEASE ordering
+
+RELEASE ordering guarantees that prior loads and stores are completed
+before the operation. It is implemented using smp_store_release().
+
+Examples:
+~~~~~~~~~
+
+For fully unordered operations, stores to @a, @b and @c can
+happen in any sequence:
+
+P0(int *a, int *b, int *c)
+{
+        WRITE_ONCE(*a, 1);
+        WRITE_ONCE(*b, 1);
+        WRITE_ONCE(*c, 1);
+}
+
+
+For a RELEASE ordered operation, the read from and the write to @a
+are guaranteed to happen before the store to @b. There are no
+guarantees on the order of the store to and the read from @c:
+
+P0(int *a, int *b, int *c)
+{
+        READ_ONCE(*a);
+        WRITE_ONCE(*a, 1);
+        smp_store_release(b, 1);
+        WRITE_ONCE(*c, 1);
+        READ_ONCE(*c);
+}
+
+
+case 4) - increment-based RMW ops that return a value
+-----------------------------------------------------
+
+Function changes:
+ atomic_inc_not_zero() --> refcount_inc_not_zero()
+ no atomic counterpart --> refcount_add_not_zero()
+
+Memory ordering guarantee changes:
+ fully ordered --> control dependency on success for stores
+
+Control dependency on success guarantees that if a reference for an
+object was successfully obtained (reference counter increment or addition
+happened, the function returned true), then further stores are ordered
+against this operation. Control dependencies on stores are not implemented
+using any explicit barriers; we rely on the CPU not to speculate on stores.
+
+*Note*: we really assume here that the necessary ordering is provided as a
+result of obtaining a pointer to the object!
+
+Examples:
+~~~~~~~~~
+
+For a fully ordered atomic operation, smp_mb() barriers are inserted before
+and after the actual operation:
+
+P0(int *a, int *b, int *c)
+{
+        WRITE_ONCE(*b, 2);
+        READ_ONCE(*c);
+        if ( ({ smp_mb(); ret = do_atomic_inc_not_zero(a); smp_mb(); ret; }) ) {
+                safely_perform_operation_on_object_protected_by_@a();
+                ...
+        }
+        WRITE_ONCE(*c, 2);
+        READ_ONCE(*b);
+}
+
+These barriers guarantee that all prior loads and stores (@b and @c)
+are completed before the operation, and that all later loads and
+stores (@c and @b) are completed after the operation.
+
+For a fully unordered refcount operation, smp_mb() barriers are absent
+and only a control dependency on stores is guaranteed:
+
+P0(int *a, int *b, int *c)
+{
+        WRITE_ONCE(*b, 2);
+        READ_ONCE(*c);
+        if ( ({ ret = do_refcount_inc_not_zero(a); ret; }) ) {
+                perform_store_operation_on_object_protected_by_@a();
+                /* here we assume that necessary ordering is provided
+                 * using other means, such as locks etc. */
+                ...
+        }
+        WRITE_ONCE(*c, 2);
+        READ_ONCE(*b);
+}
+
+There are no guarantees on the order of stores and loads to/from @b and @c.
+
+
+case 5) - decrement-based RMW ops that return a value
+-----------------------------------------------------
+
+Function changes:
+ atomic_dec_and_test() --> refcount_dec_and_test()
+ atomic_sub_and_test() --> refcount_sub_and_test()
+ no atomic counterpart --> refcount_dec_if_one()
+ atomic_add_unless(&var, -1, 1) --> refcount_dec_not_one(&var)
+
+Memory ordering guarantee changes:
+ fully ordered --> RELEASE ordering + control dependency on success for stores
+
+Note: atomic_add_unless() only provides full ordering on success.
+
+Examples:
+~~~~~~~~~
+
+For a fully ordered atomic operation, smp_mb() barriers are inserted before
+and after the actual operation:
+
+P0(int *a, int *b, int *c)
+{
+        WRITE_ONCE(*b, 2);
+        READ_ONCE(*c);
+        if ( ({ smp_mb(); ret = do_atomic_dec_and_test(a); smp_mb(); ret; }) ) {
+                safely_free_the_object_protected_by_@a();
+                ...
+        }
+        WRITE_ONCE(*c, 2);
+        READ_ONCE(*b);
+}
+
+These barriers guarantee that all prior loads and stores (@b and @c) are
+completed before the operation, and all later ones after the operation.
+
+For a RELEASE ordered refcount operation, only prior accesses are ordered:
+
+P0(int *a, int *b, int *c)
+{
+        WRITE_ONCE(*b, 2);
+        READ_ONCE(*c);
+        if ( ({ ret = do_refcount_dec_and_test(a); /* RELEASE */ ret; }) ) {
+                safely_free_the_object_protected_by_@a();
+                /* here we know that this is 1 --> 0 transition
+                 * and therefore we are the last user of this object
+                 * so no concurrency issues are present */
+                ...
+        }
+        WRITE_ONCE(*c, 2);
+        READ_ONCE(*b);
+}
+
+Here the RELEASE ordering of the operation (cf. smp_store_release())
+guarantees that the store to @b and the read from @c happen before the
+operation. However, there is no guarantee on the order of the store to
+@c and the read from @b following the if clause.
+
+
+case 6) - lock-based RMW
+------------------------
+
+Function changes:
+
+ atomic_dec_and_lock() --> refcount_dec_and_lock()
+ atomic_dec_and_mutex_lock() --> refcount_dec_and_mutex_lock()
+
+Memory ordering guarantee changes:
+ fully ordered --> RELEASE ordering always, and on success ACQUIRE
+ ordering & control dependency for stores
+
+
+ACQUIRE ordering guarantees that loads and stores issued after the ACQUIRE
+operation are completed after the operation. In this case it is implemented
+using spin_lock() or mutex_lock(), respectively.
+
+
--
2.7.4