[PATCH 2/3] perf/core: Try to allocate task_ctx_data quickly
From: Namhyung Kim
Date: Wed Feb 11 2026 - 17:32:35 EST
attach_global_ctx_data() uses an O(N^2) algorithm to allocate the
context data for each thread. This caused performance problems on large
systems with around 100k threads.
Because kmalloc(GFP_KERNEL) can sleep, it cannot be called under the
RCU lock. So try GFP_NOWAIT first, so that the loop can continue in the
common case without traversing the whole thread list again.
Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
---
kernel/events/core.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index b8498e9891e21c18..5b05a71edeb47955 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5489,6 +5489,13 @@ attach_global_ctx_data(struct kmem_cache *ctx_cache)
cd = NULL;
}
if (!cd) {
+ /*
+ * Try to allocate context quickly before
+ * traversing the whole thread list again.
+ */
+ if (!attach_task_ctx_data(p, ctx_cache, true,
+ GFP_NOWAIT))
+ continue;
get_task_struct(p);
goto alloc;
}
--
2.53.0.273.g2a3d683680-goog
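
The fast path described in the commit message can be illustrated with a
small userspace sketch. This is not the kernel code: alloc_nowait(),
alloc_blocking(), attach_all() and struct task are hypothetical
stand-ins, the lock is only marked by comments, and the "budget"
parameter simply simulates how many non-blocking allocations succeed.
The function returns how many times the list had to be rescanned from
the start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NTASKS 8

/* Hypothetical stand-in for a thread that needs context data. */
struct task {
	void *ctx;
};

/*
 * Hypothetical stand-in for a GFP_NOWAIT allocation: never sleeps,
 * but fails once the simulated budget is used up.
 */
static void *alloc_nowait(int *budget)
{
	if (*budget <= 0)
		return NULL;
	(*budget)--;
	return malloc(16);
}

/*
 * Hypothetical stand-in for a GFP_KERNEL allocation: may sleep, so it
 * must only be called outside the locked region.
 */
static void *alloc_blocking(void)
{
	return malloc(16);
}

/* Returns the number of full rescans, or -1 if allocation fails. */
static int attach_all(struct task *tasks, size_t n, int nowait_budget)
{
	int rescans = 0;
	size_t i;

	assert(tasks != NULL);
again:
	/* the lock would be taken here */
	for (i = 0; i < n; i++) {
		if (tasks[i].ctx)
			continue;
		/* Fast path: try a non-sleeping allocation in place. */
		tasks[i].ctx = alloc_nowait(&nowait_budget);
		if (tasks[i].ctx)
			continue;
		/* Slow path: leave the locked region for this task. */
		break;
	}
	/* the lock would be dropped here */

	if (i < n) {
		tasks[i].ctx = alloc_blocking();
		if (!tasks[i].ctx)
			return -1;
		rescans++;
		goto again;	/* restart the traversal from the beginning */
	}
	return rescans;
}
```

In this sketch, if every non-blocking attempt succeeds the list is
walked once with no rescans. If none succeed, each task costs one
blocking allocation plus a restart from the head of the list, which is
the quadratic behaviour the patch tries to avoid in the common case.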