Re: [RFC PATCH v5 1/2] hazptr: Implement Hazard Pointers

From: Paul E. McKenney

Date: Fri Jun 26 2026 - 19:06:36 EST

[ Trimming CC. ]

On Mon, Feb 23, 2026 at 03:44:17PM -0500, Mathieu Desnoyers wrote:
> This API provides existence guarantees of objects through Hazard
> Pointers [1] (hazptr).
>
> Its main benefit over RCU is that it allows fast reclaim of
> HP-protected pointers without needing to wait for a grace period.
>
> This implementation has 4 statically allocated hazard pointer slots per
> cpu for the fast path, and relies on a on-stack backup slot allocated by
> the hazard pointer user as fallback in case no per-cpu slot is
> available.
>
> It integrates with the scheduler to migrate per-CPU slots to the backup
> slot on context switch. This ensures that the per-CPU slots won't be
> used by blocked or preempted tasks holding on hazard pointers for a long
> time.
>
> References:
>
> [1]: M. M. Michael, "Hazard pointers: safe memory reclamation for
> lock-free objects," in IEEE Transactions on Parallel and
> Distributed Systems, vol. 15, no. 6, pp. 491-504, June 2004

I (finally) put together a stress test for this, which is the following
commits on -rcu:

bfb239e036d616 torture: Add a hazptrtorture.c torture test
adda38c3cffcd6 hazptrtorture: Add testing of on-stack hazptr_ctx structures
311e199e482873 hazptrtorture: Add microsecond-scale sleep in readers
21ccb41f79a45b hazptrtorture: Enable system-independent CPU overcommit
82e8c141265c81 torture: Add a stutter_will_wait() function
9ad8169bd3a367 hazptrtorture: Use mnemonic local variables for context information
cd541d5eb7347f hazptrtorture: Split hazptr_torture_reader_tail() from hazptr_torture_reader()
0de8bda6b00051 hazptrtorture: Add kthread to release deferred hazard pointers
3deefef00acb22 hazptrtorture: Defer release of hazard pointers
65b2cf3c6ea34e hazptrtorture: Add irq_acquire to acquire hazptr from irq
98d334c04b58fe hazptrtorture: Use task_state_to_char() for task-state reporting
1e89aae50a3fde hazptrtorture: Pass hazptr_pending to hazptr_torture_reader_tail()
4b1a177190e2e1 hazptrtorture: Add the ability to disable the writer kthread
2c966f58358329 hazptrtorture: Add irq_release to release hazptr from irq
16258269b41ed3 hazptrtorture: Accumulate operation statistics
c090e2659abcb3 doc: Add hazptrtorture module parameters

Initial runs suggest that hazard pointers cannot be acquired by one
task and released by another. Is that expected behavior, or is my test
improperly passing the hazard pointers? If the latter, we should of
course document the proper hazard-pointer-passing mechanism.

Thanx, Paul

> Link: https://lpc.events/event/19/contributions/2082/
> Link: https://lore.kernel.org/lkml/j3scdl5iymjlxavomgc6u5ndg3svhab6ga23dr36o4f5mt333w@7xslvq6b6hmv/
> Link: https://lpc.events/event/18/contributions/1731/
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> Cc: Nicholas Piggin <npiggin@xxxxxxxxx>
> Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Boqun Feng <boqun@xxxxxxxxxx>
> Cc: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
> Cc: John Stultz <jstultz@xxxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
> Cc: Uladzislau Rezki <urezki@xxxxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Cc: Lai Jiangshan <jiangshanlai@xxxxxxxxx>
> Cc: Zqiang <qiang.zhang1211@xxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Waiman Long <longman@xxxxxxxxxx>
> Cc: Mark Rutland <mark.rutland@xxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: maged.michael@xxxxxxxxx
> Cc: Mateusz Guzik <mjguzik@xxxxxxxxx>
> Cc: Jonas Oberhauser <jonas.oberhauser@xxxxxxxxxxxxxxx>
> Cc: rcu@xxxxxxxxxxxxxxx
> Cc: linux-mm@xxxxxxxxx
> Cc: lkmm@xxxxxxxxxxxxxxx
> ---
> Changes since v4:
> - Fold scheduler integration.
> - Actually set ctx slot to backup slot on context switch.
> - Remove CONFIG_PREEMPT_HAZPTR config option.
> - Use per-cpu ctx pointers for context switch slot tracking
> rather than per-task lists. This accelerates the hazptr
> acquire/release fast-path.
> - Guarantee scan forward progress with two-lists scheme.
> - Reimplement the hazptr acquire with a temporary wildcard
> to eliminate a dependency on the addr_p load, likely to
> cause a pipeline stall due to the needed memory barrier.
> This simplifies the algorithm, removes the need for pointer
> re-load + comparison, and is expected to be faster on some
> architectures.
> - Reduce number of percpu slots to 4, introduce a hazptr_slot_item
> struct to contain both the slot and ctx pointers. Reducing number
> of slots to 4 makes sure all the slot and ctx pointers fit in a
> single cache line.
> - Rebased on v7.0-rc1.
>
> Changes since v3:
> - Rename hazptr_retire to hazptr_release.
> - Remove domains.
> - Introduce "backup_slot" within hazptr context structure (on stack)
> to handle slot overflow.
> - Rename hazptr_try_protect to hazptr_acquire.
> - Preallocate 8 per-CPU slots, and rely on caller-provided backup
> slots (typically on stack) for out-of-slots situations.
>
> Changes since v2:
> - Address Peter Zijlstra's comments.
> - Address Paul E. McKenney's comments.
>
> Changes since v0:
> - Remove slot variable from hp_dereference_allocate().
> ---
> include/linux/hazptr.h | 197 +++++++++++++++++++++++++++++++++
> init/main.c | 2 +
> kernel/Makefile | 2 +-
> kernel/hazptr.c | 242 +++++++++++++++++++++++++++++++++++++++++
> kernel/sched/core.c | 2 +
> 5 files changed, 444 insertions(+), 1 deletion(-)
> create mode 100644 include/linux/hazptr.h
> create mode 100644 kernel/hazptr.c
>
> diff --git a/include/linux/hazptr.h b/include/linux/hazptr.h
> new file mode 100644
> index 000000000000..461f481a480b
> --- /dev/null
> +++ b/include/linux/hazptr.h
> @@ -0,0 +1,197 @@
> +// SPDX-FileCopyrightText: 2024 Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> +//
> +// SPDX-License-Identifier: LGPL-2.1-or-later
> +
> +#ifndef _LINUX_HAZPTR_H
> +#define _LINUX_HAZPTR_H
> +
> +/*
> + * hazptr: Hazard Pointers
> + *
> + * This API provides existence guarantees of objects through hazard
> + * pointers.
> + *
> + * Its main benefit over RCU is that it allows fast reclaim of
> + * HP-protected pointers without needing to wait for a grace period.
> + *
> + * References:
> + *
> + * [1]: M. M. Michael, "Hazard pointers: safe memory reclamation for
> + * lock-free objects," in IEEE Transactions on Parallel and
> + * Distributed Systems, vol. 15, no. 6, pp. 491-504, June 2004
> + */
> +
> +#include <linux/percpu.h>
> +#include <linux/types.h>
> +#include <linux/cleanup.h>
> +#include <linux/sched.h>
> +
> +/* 4 slots (each sizeof(hazptr_slot_item)) fit in a single 64-byte cache line. */
> +#define NR_HAZPTR_PERCPU_SLOTS 4
> +#define HAZPTR_WILDCARD ((void *) 0x1UL)
> +
> +/*
> + * Hazard pointer slot.
> + */
> +struct hazptr_slot {
> + void *addr;
> +};
> +
> +struct hazptr_overflow_list;
> +
> +struct hazptr_backup_slot {
> + struct hlist_node overflow_node;
> + struct hazptr_slot slot;
> + /* Overflow list where the backup slot is added. */
> + struct hazptr_overflow_list *overflow_list;
> +};
> +
> +struct hazptr_ctx {
> + struct hazptr_slot *slot;
> + /* Backup slot in case all per-CPU slots are used. */
> + struct hazptr_backup_slot backup_slot;
> + struct hlist_node preempt_node;
> +};
> +
> +struct hazptr_slot_ctx {
> + struct hazptr_ctx *ctx;
> +};
> +
> +struct hazptr_slot_item {
> + struct hazptr_slot slot;
> + struct hazptr_slot_ctx ctx;
> +};
> +
> +struct hazptr_percpu_slots {
> + struct hazptr_slot_item items[NR_HAZPTR_PERCPU_SLOTS];
> +} ____cacheline_aligned;
> +
> +DECLARE_PER_CPU(struct hazptr_percpu_slots, hazptr_percpu_slots);
> +
> +void *__hazptr_acquire(struct hazptr_ctx *ctx, void * const * addr_p);
> +
> +/*
> + * hazptr_synchronize: Wait until @addr is released from all slots.
> + *
> + * Wait to observe that each slot contains a value that differs from
> + * @addr before returning.
> + * Should be called from preemptible context.
> + */
> +void hazptr_synchronize(void *addr);
> +
> +/*
> + * hazptr_chain_backup_slot: Chain backup slot into overflow list.
> + *
> + * Set backup slot address to @addr, and chain it into the overflow
> + * list.
> + */
> +struct hazptr_slot *hazptr_chain_backup_slot(struct hazptr_ctx *ctx);
> +
> +/*
> + * hazptr_unchain_backup_slot: Unchain backup slot from overflow list.
> + */
> +void hazptr_unchain_backup_slot(struct hazptr_ctx *ctx);
> +
> +static inline
> +bool hazptr_slot_is_backup(struct hazptr_ctx *ctx, struct hazptr_slot *slot)
> +{
> + return slot == &ctx->backup_slot.slot;
> +}
> +
> +static inline
> +void hazptr_note_context_switch(void)
> +{
> + struct hazptr_percpu_slots *percpu_slots = this_cpu_ptr(&hazptr_percpu_slots);
> + unsigned int idx;
> +
> + for (idx = 0; idx < NR_HAZPTR_PERCPU_SLOTS; idx++) {
> + struct hazptr_slot_item *item = &percpu_slots->items[idx];
> + struct hazptr_slot *slot = &item->slot, *backup_slot;
> + struct hazptr_ctx *ctx;
> +
> + if (!slot->addr)
> + continue;
> + ctx = item->ctx.ctx;
> + backup_slot = hazptr_chain_backup_slot(ctx);
> + /*
> + * Move hazard pointer from the per-CPU slot to the
> + * backup slot. This requires hazard pointer
> + * synchronize to iterate on per-CPU slots with
> + * load-acquire before iterating on the overflow list.
> + */
> + WRITE_ONCE(backup_slot->addr, slot->addr);
> + /*
> + * store-release orders store to backup slot addr before
> + * store to per-CPU slot addr.
> + */
> + smp_store_release(&slot->addr, NULL);
> + /* Use the backup slot for context. */
> + ctx->slot = backup_slot;
> + }
> +}
> +
> +/*
> + * hazptr_acquire: Load pointer at address and protect with hazard pointer.
> + *
> + * Load @addr_p, and protect the loaded pointer with hazard pointer.
> + * When using hazptr_acquire from interrupt handlers, the acquired slots
> + * need to be released before returning from the interrupt handler.
> + *
> + * Returns a non-NULL protected address if the loaded pointer is non-NULL.
> + * Returns NULL if the loaded pointer is NULL.
> + *
> + * On success the protected hazptr slot is stored in @ctx->slot.
> + */
> +static inline
> +void *hazptr_acquire(struct hazptr_ctx *ctx, void * const *addr_p)
> +{
> + struct hazptr_percpu_slots *percpu_slots;
> + struct hazptr_slot_item *slot_item;
> + struct hazptr_slot *slot;
> + void *addr;
> +
> + guard(preempt)();
> + percpu_slots = this_cpu_ptr(&hazptr_percpu_slots);
> + slot_item = &percpu_slots->items[0];
> + slot = &slot_item->slot;
> + if (unlikely(slot->addr))
> + return __hazptr_acquire(ctx, addr_p);
> + WRITE_ONCE(slot->addr, HAZPTR_WILDCARD); /* Store B */
> +
> + /* Memory ordering: Store B before Load A. */
> + smp_mb();
> +
> + /*
> + * Load @addr_p after storing wildcard to the hazard pointer slot.
> + */
> + addr = READ_ONCE(*addr_p); /* Load A */
> +
> + /*
> + * We don't care about ordering of Store C. It will simply
> + * replace the wildcard by a more specific address. If addr is
> + * NULL, we simply store NULL into the slot.
> + */
> + WRITE_ONCE(slot->addr, addr); /* Store C */
> + slot_item->ctx.ctx = ctx;
> + ctx->slot = slot;
> + return addr;
> +}
> +
> +/* Release the protected hazard pointer from @slot. */
> +static inline
> +void hazptr_release(struct hazptr_ctx *ctx, void *addr)
> +{
> + struct hazptr_slot *slot;
> +
> + if (!addr)
> + return;
> + guard(preempt)();
> + slot = ctx->slot;
> + smp_store_release(&slot->addr, NULL);
> + if (unlikely(hazptr_slot_is_backup(ctx, slot)))
> + hazptr_unchain_backup_slot(ctx);
> +}
> +
> +void hazptr_init(void);
> +
> +#endif /* _LINUX_HAZPTR_H */
> diff --git a/init/main.c b/init/main.c
> index 1cb395dd94e4..b66017629935 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -105,6 +105,7 @@
> #include <linux/ptdump.h>
> #include <linux/time_namespace.h>
> #include <linux/unaligned.h>
> +#include <linux/hazptr.h>
> #include <net/net_namespace.h>
>
> #include <asm/io.h>
> @@ -1101,6 +1102,7 @@ void start_kernel(void)
> workqueue_init_early();
>
> rcu_init();
> + hazptr_init();
> kvfree_rcu_init();
>
> /* Trace events are available after this */
> diff --git a/kernel/Makefile b/kernel/Makefile
> index 6785982013dc..b7cef6e23038 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -7,7 +7,7 @@ obj-y = fork.o exec_domain.o panic.o \
> cpu.o exit.o softirq.o resource.o \
> sysctl.o capability.o ptrace.o user.o \
> signal.o sys.o umh.o workqueue.o pid.o task_work.o \
> - extable.o params.o \
> + extable.o params.o hazptr.o \
> kthread.o sys_ni.o nsproxy.o nstree.o nscommon.o \
> notifier.o ksysfs.o cred.o reboot.o \
> async.o range.o smpboot.o ucount.o regset.o ksyms_common.o
> diff --git a/kernel/hazptr.c b/kernel/hazptr.c
> new file mode 100644
> index 000000000000..a63ac681cb85
> --- /dev/null
> +++ b/kernel/hazptr.c
> @@ -0,0 +1,242 @@
> +// SPDX-FileCopyrightText: 2024 Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> +//
> +// SPDX-License-Identifier: LGPL-2.1-or-later
> +
> +/*
> + * hazptr: Hazard Pointers
> + */
> +
> +#include <linux/hazptr.h>
> +#include <linux/percpu.h>
> +#include <linux/spinlock.h>
> +#include <linux/mutex.h>
> +#include <linux/list.h>
> +#include <linux/export.h>
> +
> +struct hazptr_overflow_list {
> + raw_spinlock_t lock; /* Lock protecting overflow list and list generation. */
> + struct hlist_head head; /* Overflow list head. */
> + uint64_t gen; /* Overflow list generation. */
> +};
> +
> +/*
> + * Flip between two lists to guarantee list scan forward progress even
> + * with frequent generation counter increments. The list additions are
> + * always done on a different list than the one used for scan. The scan
> + * successively iterates on both lists. Therefore, only list removals
> + * can cause the iteration to retry, and the number of removals is
> + * limited to the number of list elements.
> + */
> +struct hazptr_overflow_list_flip {
> + struct mutex lock; /* Mutex protecting add_idx from concurrent updates. */
> + unsigned int add_idx; /* Index of current flip-list to add to. */
> + struct hazptr_overflow_list array[2];
> +};
> +
> +static DEFINE_PER_CPU(struct hazptr_overflow_list_flip, percpu_overflow_list_flip);
> +
> +DEFINE_PER_CPU(struct hazptr_percpu_slots, hazptr_percpu_slots);
> +EXPORT_PER_CPU_SYMBOL_GPL(hazptr_percpu_slots);
> +
> +static
> +struct hazptr_slot *hazptr_get_free_percpu_slot(struct hazptr_ctx *ctx)
> +{
> + struct hazptr_percpu_slots *percpu_slots = this_cpu_ptr(&hazptr_percpu_slots);
> + unsigned int idx;
> +
> + for (idx = 0; idx < NR_HAZPTR_PERCPU_SLOTS; idx++) {
> + struct hazptr_slot_item *item = &percpu_slots->items[idx];
> + struct hazptr_slot *slot = &item->slot;
> +
> + if (!slot->addr) {
> + item->ctx.ctx = ctx;
> + return slot;
> + }
> + }
> + /* All slots are in use. */
> + return NULL;
> +}
> +
> +/*
> + * Hazard pointer acquire slow path.
> + * Called with preemption disabled.
> + */
> +void *__hazptr_acquire(struct hazptr_ctx *ctx, void * const *addr_p)
> +{
> + struct hazptr_slot *slot = hazptr_get_free_percpu_slot(ctx);
> + void *addr;
> +
> + /*
> + * If all the per-CPU slots are already in use, fallback
> + * to the backup slot.
> + */
> + if (unlikely(!slot))
> + slot = hazptr_chain_backup_slot(ctx);
> + WRITE_ONCE(slot->addr, HAZPTR_WILDCARD); /* Store B */
> +
> + /* Memory ordering: Store B before Load A. */
> + smp_mb();
> +
> + /*
> + * Load @addr_p after storing wildcard to the hazard pointer slot.
> + */
> + addr = READ_ONCE(*addr_p); /* Load A */
> +
> + /*
> + * We don't care about ordering of Store C. It will simply
> + * replace the wildcard by a more specific address. If addr is
> + * NULL, we simply store NULL into the slot.
> + */
> + WRITE_ONCE(slot->addr, addr); /* Store C */
> + ctx->slot = slot;
> + if (!addr && hazptr_slot_is_backup(ctx, slot))
> + hazptr_unchain_backup_slot(ctx);
> + return addr;
> +}
> +EXPORT_SYMBOL_GPL(__hazptr_acquire);
> +
> +/*
> + * Perform piecewise iteration on overflow list waiting until "addr" is
> + * not present. Raw spinlock is released and taken between each list
> + * item and busy loop iteration. The overflow list generation is checked
> + * each time the lock is taken to validate that the list has not changed
> + * before resuming iteration or busy wait. If the generation has
> + * changed, retry the entire list traversal.
> + */
> +static
> +void hazptr_synchronize_overflow_list(struct hazptr_overflow_list *overflow_list, void *addr)
> +{
> + struct hazptr_backup_slot *backup_slot;
> + uint64_t snapshot_gen;
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(&overflow_list->lock, flags);
> +retry:
> + snapshot_gen = overflow_list->gen;
> + hlist_for_each_entry(backup_slot, &overflow_list->head, overflow_node) {
> + /* Busy-wait if node is found. */
> + for (;;) {
> + void *load_addr = smp_load_acquire(&backup_slot->slot.addr); /* Load B */
> +
> + if (load_addr != addr && load_addr != HAZPTR_WILDCARD)
> + break;
> + raw_spin_unlock_irqrestore(&overflow_list->lock, flags);
> + cpu_relax();
> + raw_spin_lock_irqsave(&overflow_list->lock, flags);
> + if (overflow_list->gen != snapshot_gen)
> + goto retry;
> + }
> + raw_spin_unlock_irqrestore(&overflow_list->lock, flags);
> + /*
> + * Release raw spinlock, validate generation after
> + * re-acquiring the lock.
> + */
> + raw_spin_lock_irqsave(&overflow_list->lock, flags);
> + if (overflow_list->gen != snapshot_gen)
> + goto retry;
> + }
> + raw_spin_unlock_irqrestore(&overflow_list->lock, flags);
> +}
> +
> +static
> +void hazptr_synchronize_cpu_slots(int cpu, void *addr)
> +{
> + struct hazptr_percpu_slots *percpu_slots = per_cpu_ptr(&hazptr_percpu_slots, cpu);
> + unsigned int idx;
> +
> + for (idx = 0; idx < NR_HAZPTR_PERCPU_SLOTS; idx++) {
> + struct hazptr_slot_item *item = &percpu_slots->items[idx];
> +
> + /* Busy-wait if node is found. */
> + smp_cond_load_acquire(&item->slot.addr, VAL != addr && VAL != HAZPTR_WILDCARD); /* Load B */
> + }
> +}
> +
> +/*
> + * hazptr_synchronize: Wait until @addr is released from all slots.
> + *
> + * Wait to observe that each slot contains a value that differs from
> + * @addr before returning.
> + * Should be called from preemptible context.
> + */
> +void hazptr_synchronize(void *addr)
> +{
> + int cpu;
> +
> + /*
> + * Busy-wait should only be done from preemptible context.
> + */
> + lockdep_assert_preemption_enabled();
> +
> + /*
> + * Store A precedes hazptr_scan(): it unpublishes addr (sets it to
> + * NULL or to a different value), and thus hides it from hazard
> + * pointer readers.
> + */
> + if (!addr)
> + return;
> + /* Memory ordering: Store A before Load B. */
> + smp_mb();
> + /* Scan all CPUs slots. */
> + for_each_possible_cpu(cpu) {
> + struct hazptr_overflow_list_flip *overflow_list_flip = per_cpu_ptr(&percpu_overflow_list_flip, cpu);
> + unsigned int scan_idx;
> +
> + /* Scan CPU slots. */
> + hazptr_synchronize_cpu_slots(cpu, addr);
> +
> + /*
> + * Scan backup slots in percpu overflow lists.
> + * Forward progress is guaranteed by scanning one list
> + * while new elements are added into the other list.
> + */
> + guard(mutex)(&overflow_list_flip->lock);
> + scan_idx = overflow_list_flip->add_idx ^ 1;
> + hazptr_synchronize_overflow_list(&overflow_list_flip->array[scan_idx], addr);
> + /* Flip current list. */
> + WRITE_ONCE(overflow_list_flip->add_idx, scan_idx);
> + hazptr_synchronize_overflow_list(&overflow_list_flip->array[scan_idx ^ 1], addr);
> + }
> +}
> +EXPORT_SYMBOL_GPL(hazptr_synchronize);
> +
> +struct hazptr_slot *hazptr_chain_backup_slot(struct hazptr_ctx *ctx)
> +{
> + struct hazptr_overflow_list_flip *overflow_list_flip = this_cpu_ptr(&percpu_overflow_list_flip);
> + unsigned int list_idx = READ_ONCE(overflow_list_flip->add_idx);
> + struct hazptr_overflow_list *overflow_list = &overflow_list_flip->array[list_idx];
> + struct hazptr_slot *slot = &ctx->backup_slot.slot;
> +
> + slot->addr = NULL;
> + guard(raw_spinlock_irqsave)(&overflow_list->lock);
> + overflow_list->gen++;
> + hlist_add_head(&ctx->backup_slot.overflow_node, &overflow_list->head);
> + ctx->backup_slot.overflow_list = overflow_list;
> + return slot;
> +}
> +EXPORT_SYMBOL_GPL(hazptr_chain_backup_slot);
> +
> +void hazptr_unchain_backup_slot(struct hazptr_ctx *ctx)
> +{
> + struct hazptr_overflow_list *overflow_list = ctx->backup_slot.overflow_list;
> +
> + guard(raw_spinlock_irqsave)(&overflow_list->lock);
> + overflow_list->gen++;
> + hlist_del(&ctx->backup_slot.overflow_node);
> +}
> +EXPORT_SYMBOL_GPL(hazptr_unchain_backup_slot);
> +
> +void __init hazptr_init(void)
> +{
> + int cpu;
> +
> + for_each_possible_cpu(cpu) {
> + struct hazptr_overflow_list_flip *overflow_list_flip = per_cpu_ptr(&percpu_overflow_list_flip, cpu);
> +
> + mutex_init(&overflow_list_flip->lock);
> + for (int i = 0; i < 2; i++) {
> + raw_spin_lock_init(&overflow_list_flip->array[i].lock);
> + INIT_HLIST_HEAD(&overflow_list_flip->array[i].head);
> + }
> + }
> +}
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 759777694c78..b3e10be20329 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -60,6 +60,7 @@
> #include <linux/profile.h>
> #include <linux/psi.h>
> #include <linux/rcuwait_api.h>
> +#include <linux/hazptr.h>
> #include <linux/rseq.h>
> #include <linux/sched/wake_q.h>
> #include <linux/scs.h>
> @@ -6790,6 +6791,7 @@ static void __sched notrace __schedule(int sched_mode)
> local_irq_disable();
> rcu_note_context_switch(preempt);
> migrate_disable_switch(rq, prev);
> + hazptr_note_context_switch();
>
> /*
> * Make sure that signal_pending_state()->signal_pending() below
> --
> 2.39.5
>