Re: [PATCH 22/27] locking/lockdep: Reuse list entries that are no longer in use

From: Bart Van Assche
Date: Mon Dec 03 2018 - 13:17:06 EST


On Mon, 2018-12-03 at 18:32 +-0100, Peter Zijlstra wrote:
+AD4 On Mon, Dec 03, 2018 at 08:40:48AM -0800, Bart Van Assche wrote:
+AD4
+AD4 +AD4 +AD4 I think we can do this with a free bitmap and an array of 2 pending
+AD4 +AD4 +AD4 bitmaps and an index. Add newly freed entries to the pending bitmap
+AD4 +AD4 +AD4 indicated by the current index, when complete flip the index -- such
+AD4 +AD4 +AD4 that further new bits go to the other pending bitmap -- and call+AF8-rcu().
+AD4 +AD4 +AD4
+AD4 +AD4 +AD4 Then, on the call+AF8-rcu() callback, ie. after a GP has happened, OR our
+AD4 +AD4 +AD4 pending bitmap into the free bitmap, and when the other pending bitmap
+AD4 +AD4 +AD4 isn't empty, flip the index again and start it all again.
+AD4 +AD4 +AD4
+AD4 +AD4 +AD4 This ensures there is at least one full GP between setting a bit and it
+AD4 +AD4 +AD4 landing in the free mask.
+AD4 +AD4
+AD4 +AD4 Hi Peter,
+AD4 +AD4
+AD4 +AD4 How about the following alternative which requires only two bitmaps instead
+AD4 +AD4 of three:
+AD4 +AD4 - Maintain two bitmaps, one for the free entries and one for the entries
+AD4 +AD4 that are being freed.
+AD4 +AD4 - Protect all accesses to both bitmaps with the graph lock.
+AD4 +AD4 - zap+AF8-class() sets a bit in the +ACI-being freed+ACI bitmap for the entries that
+AD4 +AD4 should be freed after a GP.
+AD4 +AD4 - Instead of making free+AF8-zapped+AF8-classes() wait for a grace period by calling
+AD4 +AD4 synchronize+AF8-sched(), use call+AF8-rcu() and do the freeing work from inside the
+AD4 +AD4 RCU callback.
+AD4 +AD4 - From inside the RCU callback, set a bit in the +ACI-free+ACI bitmap for all entries
+AD4 +AD4 that have a bit set in the +ACI-being freed+ACI bitmap and clears the +ACI-being freed+ACI
+AD4 +AD4 bitmap.
+AD4
+AD4 What happens when another unreg happens while the rcu+AF8-call thing is
+AD4 still pending?

A new flag will have to keep track of whether or not an RCU callback has
already been scheduled via rcu+AF8-call() but not yet executed to avoid double
RCU call complaints. In other code a possible alternative would be to
allocate the RCU head data structure dynamically. However, I don't think
that alternative is appropriate inside the lockdep code - I don't want to
introduce a circular dependency between the lockdep code and the memory
allocator.

Bart.