[PATCH v2 21/22] mpool: add documentation

From: Nabeel M Mohamed
Date: Mon Oct 12 2020 - 12:29:46 EST


This adds locking hierarchy documentation for mpool and
updates ioctl-number.rst with mpool driver's ioctl code.

Co-developed-by: Greg Becker <gbecker@xxxxxxxxxx>
Signed-off-by: Greg Becker <gbecker@xxxxxxxxxx>
Co-developed-by: Pierre Labat <plabat@xxxxxxxxxx>
Signed-off-by: Pierre Labat <plabat@xxxxxxxxxx>
Co-developed-by: John Groves <jgroves@xxxxxxxxxx>
Signed-off-by: John Groves <jgroves@xxxxxxxxxx>
Signed-off-by: Nabeel M Mohamed <nmeeramohide@xxxxxxxxxx>
---
.../userspace-api/ioctl/ioctl-number.rst | 3 +-
drivers/mpool/mpool-locking.rst | 90 +++++++++++++++++++
2 files changed, 92 insertions(+), 1 deletion(-)
create mode 100644 drivers/mpool/mpool-locking.rst

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 2a198838fca9..1928606ff447 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -97,7 +97,8 @@ Code Seq# Include File Comments
'&' 00-07 drivers/firewire/nosy-user.h
'1' 00-1F linux/timepps.h PPS kit from Ulrich Windl
<ftp://ftp.de.kernel.org/pub/linux/daemons/ntp/PPS/>
-'2' 01-04 linux/i2o.h
+'2' 01-04 linux/i2o.h conflict!
+'2' 00-8F drivers/mpool/mpool_ioctl.h conflict!
'3' 00-0F drivers/s390/char/raw3270.h conflict!
'3' 00-1F linux/suspend_ioctls.h, conflict!
kernel/power/user.c
diff --git a/drivers/mpool/mpool-locking.rst b/drivers/mpool/mpool-locking.rst
new file mode 100644
index 000000000000..6a5da727f2fb
--- /dev/null
+++ b/drivers/mpool/mpool-locking.rst
@@ -0,0 +1,90 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+=============
+Mpool Locking
+=============
+
+Hierarchy
+---------
+::
+
+ mpool_s_lock
+ pmd_s_lock
+ eld_rwlock object layout r/w lock (per layout)
+ pds_oml_lock "open mlog" rbtree lock
+ mdi_slotvlock
+ mmi_uqlock unique ID generator lock
+ mmi_compactlock compaction lock (per MDC)
+ mmi_uc_lock uncommitted objects rbtree lock (per MDC)
+ mmi_co_lock committed objects rbtree lock (per MDC)
+ pds_pdvlock
+ pdi_rmlock[]
+ sda_dalock
+
+Nesting
+-------
+
+There are three nesting levels for mblocks, mlogs, and mpcore's own
+metadata containers (MDCs):
+
+1. PMD_OBJ_CLIENT for client mblocks and mlogs.
+2. PMD_MDC_NORMAL for MDC-1/255 and their underlying mlog pairs.
+3. PMD_MDC_ZERO for MDC-0 and its underlying mlog pair.
+
+A thread of execution may obtain at most one instance of a given lock-class
+at each nesting level, and must do so in the order specified above.
+
+The following helper functions determine the nesting level and use the
+appropriate _nested() primitive or lock pool::
+
+ pmd_obj_rdlock() and _rdunlock()
+ pmd_obj_wrlock() and _wrunlock()
+ pmd_mdc_rdlock() and _rdunlock()
+ pmd_mdc_wrlock() and _wrunlock()
+ pmd_mdc_lock() and _unlock()
+
+For additional information on the _nested() primitives, see
+https://www.kernel.org/doc/Documentation/locking/lockdep-design.txt.
+
+MDC Compaction Locking Patterns
+-------------------------------
+
+In addition to obeying the lock hierarchy and lock-class nesting levels, the
+following locking rules must also be followed for object layouts and all
+mpool properties stored in MDC-0 (e.g., the list of mpool drives pds_pdv[]).
+
+Object layouts (struct pmd_layout):
+
+- Readers must read-lock the layout using pmd_obj_rdlock().
+- Updaters must both write-lock the layout using pmd_obj_wrlock() and lock
+ the mmi_compactlock for the object's MDC using pmd_mdc_lock() before
+ first logging the update in that MDC and then updating the layout.
+
+Mpool properties stored in MDC-0:
+
+- Readers must read-lock the data structure(s) associated with the property.
+- Updaters must both write-lock the data structure(s) associated with the
+ property and lock the mmi_compactlock for MDC-0 using pmd_mdc_lock() before
+ first logging the update in MDC-0 and then updating the data structure(s).
+
+This locking pattern achieves the following:
+
+- For objects associated with a given MDC-0/255, layout readers can execute
+ concurrent with compacting that MDC, whereas layout updaters cannot.
+- For mpool properties stored in MDC-0, property readers can execute
+ concurrent with compacting MDC-0, whereas property updaters cannot.
+- To compact a given MDC-0/255, all in-memory and on-media state to be
+ written is frozen by simply locking the mmi_compactlock for that MDC
+ (because updates to the committed objects tree may take place only while
+ holding both both the compaction mutex and the mmi_co_lock write lock).
+
+Furthermore, taking the mmi_compactlock does not reduce concurrency for
+object or property updaters because these are inherently serialized by the
+requirement to synchronously append log records in the associated MDC.
+
+Object Layout Reference Counts
+------------------------------
+
+The reference counts for an object layout (eld_ref) are protected
+by mmi_co_lock or mmi_uc_lock of the object's MDC dependiing upon
+which tree it is in at the time of acquisition.
--
2.17.2