[PATCH v3 1/4] sunrpc: route to a populated pool in svc_pool_for_cpu()
From: Jeff Layton
Date: Mon Jun 29 2026 - 13:53:00 EST
svc_set_num_threads() spreads the requested threads evenly across the
service's pools (base = nrservs / sv_nrpools). When a service runs
fewer threads than it has pools -- e.g. an nfsd configured with fewer
threads than the host has NUMA nodes while running in "pernode" or
"percpu" mode -- the trailing pools are left with no threads at all.
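
As a rough illustration of the arithmetic above, here is a minimal
userspace sketch, not the kernel code. The helper name threads_for_pool()
is hypothetical, and handing the remainder to the leading pools is an
assumption made for the example. With 2 threads and 4 pools it gives
1, 1, 0, 0, so pools 2 and 3 end up empty.

```c
/*
 * Hypothetical helper, userspace only: number of threads that pool
 * `idx` receives when `nrservs` threads are spread over `nrpools`
 * pools. Assumes nrpools >= 1 and that any remainder goes to the
 * lowest-numbered pools (an assumption made for this sketch).
 */
static unsigned int threads_for_pool(unsigned int nrservs,
				     unsigned int nrpools,
				     unsigned int idx)
{
	unsigned int base = nrservs / nrpools;
	unsigned int remain = nrservs % nrpools;

	/* The first `remain` pools each take one extra thread. */
	return base + (idx < remain ? 1 : 0);
}
```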

svc_xprt_enqueue() selects a pool from the CPU servicing the transport,
queues the transport on that pool's sp_xprts, and only wakes a thread
from the same pool. Each thread services exclusively its own pool, so a
transport that lands on a threadless pool is enqueued on sp_xprts and
never picked up: the connection hangs indefinitely.

Have svc_pool_for_cpu() skip pools that currently have no threads,
falling back to the next populated pool. This trades NUMA locality for
a guarantee that the work is actually serviced. sp_nrthreads is only
updated under the service mutex; the lockless read here is a best-effort
routing hint, so annotate it with data_race().
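
The fallback scan can be sketched in plain userspace C as follows. This
is an illustration under stated assumptions, not the patch itself:
struct pool and pick_populated_pool() are hypothetical stand-ins for
struct svc_pool and the loop in svc_pool_for_cpu(), and the lockless
read that the patch wraps in data_race() is an ordinary load here. The
caller is assumed to pass nrpools >= 1.

```c
/* Hypothetical stand-in for struct svc_pool; only the count matters. */
struct pool {
	unsigned int nrthreads;
};

/*
 * Starting at `start`, return the index of the first pool that has at
 * least one thread, wrapping around at the end of the array. If every
 * pool is empty, the scan completes a full cycle and returns the
 * original (wrapped) starting index.
 */
static unsigned int pick_populated_pool(const struct pool *pools,
					unsigned int nrpools,
					unsigned int start)
{
	unsigned int pidx = start % nrpools;
	unsigned int i;

	for (i = 0; i < nrpools; i++) {
		if (pools[pidx].nrthreads)
			return pidx;

		if (++pidx >= nrpools)
			pidx = 0;
	}

	return pidx;
}
```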

Fixes: 0f0257eaa5d2 ("svc: Move the xprt independent code to the svc_xprt.c file")
Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
---
net/sunrpc/svc.c | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index dd80a2eaaa74..82fb7faf563f 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -402,6 +402,7 @@ struct svc_pool *svc_pool_for_cpu(struct svc_serv *serv)
struct svc_pool_map *m = &svc_pool_map;
int cpu = raw_smp_processor_id();
unsigned int pidx = 0;
+ unsigned int i;
if (serv->sv_nrpools <= 1)
return serv->sv_pools;
@@ -414,8 +415,31 @@ struct svc_pool *svc_pool_for_cpu(struct svc_serv *serv)
pidx = m->to_pool[cpu_to_node(cpu)];
break;
}
+ pidx %= serv->sv_nrpools;
+
+ /*
+ * Threads are spread evenly across the pools, but when there are
+ * fewer threads than pools some pools can end up with none. A
+ * transport enqueued on a threadless pool would never be picked
+ * up, since each thread only services its own pool. Fall back to
+ * the next populated pool, trading NUMA locality for a guarantee
+ * that the transport is serviced.
+ */
+ for (i = 0; i < serv->sv_nrpools; i++) {
+ struct svc_pool *pool = &serv->sv_pools[pidx];
+
+ /* sp_nrthreads is only updated under the service mutex and
+ * rarely changes, so a data race here is harmless.
+ */
+ if (data_race(pool->sp_nrthreads))
+ return pool;
+
+ if (++pidx >= serv->sv_nrpools)
+ pidx = 0;
+ }
- return &serv->sv_pools[pidx % serv->sv_nrpools];
+ /* No pool has any threads; nothing can service the transport. */
+ return &serv->sv_pools[pidx];
}
static int svc_rpcb_setup(struct svc_serv *serv, struct net *net)
--
2.54.0