Re: [RFC] Synchronized Shared Pointers for the Linux kernel
From: Mathieu Desnoyers
Date: Fri Oct 11 2024 - 11:17:28 EST
On 2024-10-11 01:11, Boqun Feng wrote:
> On Thu, Oct 10, 2024 at 03:16:25PM -0400, Mathieu Desnoyers wrote:
>> Hi,
>>
>> I've created a new API (sharedptr.h) for the use-case of
>> providing existence object guarantees (e.g. for Rust)
>> when dereferencing pointers which can be concurrently updated.
>> I call this "Synchronized Shared Pointers".
>>
>> This should be an elegant solution to Greg's refcount
>> existence use-case as well.
>>
>> The current implementation can be found here:
>>
>> https://github.com/compudj/linux-dev/commit/64c3756b88776fe534629c70f6a1d27fad27e9ba
>>
>> Patch added inline below for feedback.
>>
>> Thanks!
>>
>> Mathieu
>>
>> [...]
>>
>> + */
>> +static inline
>> +struct sharedptr sharedptr_copy_from_sync(const struct syncsharedptr *ssp)
>> +{
>> +	struct sharedptr_node *spn, *hp;
>> +	struct hazptr_slot *slot;
>> +	struct sharedptr sp;
>> +
>> +	preempt_disable();
>
> Disabling preemption acts as an RCU read-side critical section, so I
> guess the immediate question is why (or when) not use RCU ;-)
That's a very relevant question indeed! Why use hazard pointers rather
than RCU in this particular use-case?
You are right that I could add a rcu_read_lock()/rcu_read_unlock()
around sharedptr_copy_from_sync(), and pair this with a call_rcu()
in node_release, which would effectively replace hazard pointers
by RCU.
Please keep in mind that the current implementation of this API
is minimalist. I mean to extend this, and this is where the benefits
of hazard pointers over RCU should become clearer:
1) With hazard pointers, AFAIU we can modify sharedptr_delete
and syncsharedptr_delete to issue hazptr_scan() _before_
decrementing the reference count to 0, e.g.:
	int old, new;

	WRITE_ONCE(sp->spn, NULL);
	old = refcount_read(&spn->refcount);
	do {
		new = old - 1;
		/* Make sure no hazard pointer still protects spn before dropping to 0. */
		if (!new)
			hazptr_scan(&hazptr_domain_sharedptr, spn, NULL);
	} while (!atomic_try_cmpxchg(&spn->refcount.refs, &old, new));
	if (!new)
		sharedptr_node_release(spn);
And therefore modify sharedptr_copy_from_sync to use a refcount_inc
rather than a refcount_inc_not_zero, because the reference count
can never be 0 while a hazard pointer to the object exists.
This modification would make hazard pointers act as if they
*are* holding a reference count on the object.
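
To make this concrete, here is a rough sketch of what
sharedptr_copy_from_sync() could look like after that change. The slot
handling (the per-cpu sharedptr_hazptr_slot variable and the ->addr
member of struct hazptr_slot) is guessed from the partial quote above
rather than copied from the actual commit, so treat those names as
placeholders:

	/* Assumed per-cpu slot; the real implementation keeps its slots elsewhere. */
	static DEFINE_PER_CPU(struct hazptr_slot, sharedptr_hazptr_slot);

	static inline
	struct sharedptr sharedptr_copy_from_sync(const struct syncsharedptr *ssp)
	{
		struct sharedptr_node *spn, *hp;
		struct hazptr_slot *slot;
		struct sharedptr sp;

		preempt_disable();
		slot = this_cpu_ptr(&sharedptr_hazptr_slot);
		spn = READ_ONCE(ssp->spn);
		for (;;) {
			if (!spn) {
				sp.spn = NULL;
				goto out;
			}
			WRITE_ONCE(slot->addr, spn);	/* publish the hazard pointer */
			smp_mb();			/* order publish before re-reading ssp->spn */
			hp = READ_ONCE(ssp->spn);
			if (hp == spn)
				break;			/* spn is now pinned by the slot */
			spn = hp;			/* lost a race with an update, retry */
		}
		/*
		 * The published hazard pointer pins spn: the delete path scans
		 * the slots before dropping the count to 0, so the count cannot
		 * be 0 here and a plain refcount_inc() suffices.
		 */
		refcount_inc(&spn->refcount);
		sp.spn = spn;
	out:
		WRITE_ONCE(slot->addr, NULL);		/* release the slot */
		preempt_enable();
		return sp;
	}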
This improvement brings us to a more important benefit:
2) If we increase the number of available hazptr slots per CPU
(e.g. to 8 or more), then we can start using hazard pointers as a
reference counter replacement (fast path).
This will allow introducing a new type of sharedptr object, which
could be named "thread sharedptr", meant to hold a reference for a
relatively short period of time, on a single thread, where the thread
is still allowed to be preempted or block while holding the thread
sharedptr.
The per-cpu hazard pointer slots would point to per-thread sharedptr
structures, which would in turn hold a pointer to the hazard pointer
slot protecting them.
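
For illustration, such a per-thread structure could look roughly like
the following; the field names are made up for this sketch and would
not necessarily match an eventual patch:

	/* Hypothetical layout of a thread sharedptr, for illustration only. */
	struct thread_sharedptr {
		struct sharedptr_node *spn;	/* object currently used by this thread */
		struct hazptr_slot *slot;	/* per-cpu slot protecting spn, NULL if none */
		bool promoted;			/* a real reference count is held instead */
		unsigned long grab_time;	/* jiffies when the slot was taken (for aging) */
	};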
When the application wants to keep the reference for longer, e.g. to
store it into a data structure or pass it around to another thread,
it should copy the thread sharedptr into a normal sharedptr, which
makes sure a reference count is taken on the object.
The thread sharedptr is tied to the thread using it. Because the
hazard pointer context would be in a well-defined per-thread area
(rather than just on the stack), we can do the following when
scanning for hazard pointers: force the thread sharedptr to be
promoted to a reference count increment on the object, thus allowing
the hazard pointer scan to progress. This frees up the per-cpu
slot immediately.
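
As a rough illustration of that promotion step (reusing the
hypothetical struct thread_sharedptr above, and leaving out the
synchronization needed against the owning thread releasing it
concurrently):

	/*
	 * Hypothetical sketch: called by the hazard pointer scan when it finds
	 * a per-cpu slot pointing at a thread sharedptr and wants the slot back.
	 */
	static void thread_sharedptr_promote(struct thread_sharedptr *tsp)
	{
		refcount_inc(&tsp->spn->refcount);	/* pin the object with a real count */
		WRITE_ONCE(tsp->promoted, true);	/* release path must now drop the count */
		WRITE_ONCE(tsp->slot->addr, NULL);	/* the per-cpu slot is free again */
		tsp->slot = NULL;
	}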
If all per-cpu hazard pointer slots are used, the thread sharedptr
would automatically fall back to a reference count.
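
The acquisition fast path could then look for a free per-cpu slot and
fall back to a plain reference count when none is available. Another
hypothetical sketch (sharedptr_hazptr_slots and NR_SHAREDPTR_SLOTS are
made-up names), assuming the caller already protects spn against
concurrent deletion for the duration of the call, e.g. via the
validated copy path sketched earlier; only the slot-or-refcount choice
is shown:

	#define NR_SHAREDPTR_SLOTS	8	/* hypothetical, per point 2) above */

	static DEFINE_PER_CPU(struct hazptr_slot, sharedptr_hazptr_slots[NR_SHAREDPTR_SLOTS]);

	static void thread_sharedptr_acquire(struct thread_sharedptr *tsp,
					     struct sharedptr_node *spn)
	{
		struct hazptr_slot *slots;
		int i;

		tsp->spn = spn;
		tsp->slot = NULL;
		tsp->promoted = false;
		tsp->grab_time = jiffies;

		preempt_disable();
		slots = this_cpu_ptr(&sharedptr_hazptr_slots[0]);
		for (i = 0; i < NR_SHAREDPTR_SLOTS; i++) {
			if (!READ_ONCE(slots[i].addr)) {
				/* Order tsp setup before making it visible to scans. */
				smp_store_release(&slots[i].addr, tsp);
				tsp->slot = &slots[i];
				break;
			}
		}
		preempt_enable();

		if (!tsp->slot) {
			/* All per-cpu slots busy: fall back to a plain reference count. */
			refcount_inc(&spn->refcount);
			tsp->promoted = true;
		}
	}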
We could even add a per-cpu timer which tracks how old each per-CPU
hazard pointer slot is, and promotes it to a reference count
increment based on its age if needed.
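
Such an aging pass could look roughly like this, once more with the
hypothetical names from the previous sketches, and with the timer
setup and the owner-vs-scanner synchronization left out:

	/* Hypothetical per-cpu aging pass, run e.g. from a per-cpu timer. */
	#define SHAREDPTR_SLOT_MAX_AGE	msecs_to_jiffies(10)	/* arbitrary threshold */

	static void sharedptr_age_slots(struct timer_list *t)
	{
		struct hazptr_slot *slots = this_cpu_ptr(&sharedptr_hazptr_slots[0]);
		int i;

		for (i = 0; i < NR_SHAREDPTR_SLOTS; i++) {
			struct thread_sharedptr *tsp = READ_ONCE(slots[i].addr);

			if (tsp && time_after(jiffies, tsp->grab_time + SHAREDPTR_SLOT_MAX_AGE))
				thread_sharedptr_promote(tsp);	/* frees slots[i] */
		}
	}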
This would effectively allow implementing a "per-thread shared pointer"
fast-path, which would scale better than just reference counters on large
multi-core systems.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com