Re: [PATCH v2] sunrpc: hardcode pool_mode to pernode, remove other modes

From: Chuck Lever

Date: Thu Jun 25 2026 - 15:32:57 EST




On Thu, Jun 25, 2026, at 11:59 AM, Jeff Layton wrote:
> The SVC_POOL_AUTO/GLOBAL/PERCPU/PERNODE pool mode selection machinery
> was added when NUMA was new and the right default was unclear. Today,
> pernode is the right choice everywhere:

> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index dd80a2eaaa74..6e3d509bf95a 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -38,19 +38,6 @@
>
> static void svc_unregister(const struct svc_serv *serv, struct net *net);
>
> -#define SVC_POOL_DEFAULT SVC_POOL_GLOBAL
> -
> -/*
> - * Mode for mapping cpus to pools.
> - */
> -enum {
> - SVC_POOL_AUTO = -1, /* choose one of the others */
> - SVC_POOL_GLOBAL, /* no mapping, just a single global pool
> - * (legacy & UP mode) */
> - SVC_POOL_PERCPU, /* one pool per cpu */
> - SVC_POOL_PERNODE /* one pool per numa node */
> -};
> -
> /*
> * Structure for mapping cpus to pools and vice versa.
> * Setup once during sunrpc initialisation.

The commit message should call out that the default changes from
global (SVC_POOL_DEFAULT was SVC_POOL_GLOBAL) to pernode.


> @@ -299,22 +166,11 @@ svc_pool_map_get(void)
> return m->npools;
> }
>
> - if (m->mode == SVC_POOL_AUTO)
> - m->mode = svc_pool_map_choose_mode();
> -
> - switch (m->mode) {
> - case SVC_POOL_PERCPU:
> - npools = svc_pool_map_init_percpu(m);
> - break;
> - case SVC_POOL_PERNODE:
> - npools = svc_pool_map_init_pernode(m);
> - break;
> - }
> -
> + npools = svc_pool_map_init_pernode(m);

With one pool per node now mandatory, can a server end up with pools
that have no threads? svc_set_num_threads() spreads nrservs evenly
across the pools:

net/sunrpc/svc.c:svc_set_num_threads() {
	unsigned int base = nrservs / serv->sv_nrpools;
	unsigned int remain = nrservs % serv->sv_nrpools;
	...
}

so when nrservs is smaller than the number of NUMA nodes with CPUs,
base is 0, remain equals nrservs, and the trailing pools get zero
threads.
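
To make the arithmetic concrete, here is a standalone toy model of the
split, not the kernel code. The helper names are hypothetical, and it
assumes the remainder is handed to the lowest-numbered pools one thread
at a time, which is what "trailing pools get zero threads" implies.

```c
#include <assert.h>

#define TOY_MAX_POOLS 64	/* arbitrary bound for this sketch */

/*
 * Toy model: split nrservs threads across npools pools.
 * Assumption: the first "remain" pools each get one extra thread.
 */
void toy_split_threads(unsigned int nrservs, unsigned int npools,
		       unsigned int *per_pool)
{
	unsigned int base, remain, i;

	assert(npools > 0);
	base = nrservs / npools;
	remain = nrservs % npools;

	for (i = 0; i < npools; i++)
		per_pool[i] = base + (i < remain ? 1 : 0);
}

/* Count the pools that end up with no threads at all. */
unsigned int toy_count_empty_pools(unsigned int nrservs, unsigned int npools)
{
	unsigned int per_pool[TOY_MAX_POOLS];
	unsigned int i, empty = 0;

	assert(npools > 0 && npools <= TOY_MAX_POOLS);
	toy_split_threads(nrservs, npools, per_pool);

	for (i = 0; i < npools; i++)
		if (per_pool[i] == 0)
			empty++;
	return empty;
}
```

With nrservs = 2 and four pools, base is 0 and remain is 2, so under
that assumption pools 0 and 1 get one thread each and pools 2 and 3 get
none.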

Work is still routed to the per-node pool selected from the running
CPU:

net/sunrpc/svc_xprt.c:svc_xprt_enqueue() {
	pool = svc_pool_for_cpu(xprt->xpt_server);
	...
	lwq_enqueue(&xprt->xpt_ready, &pool->sp_xprts);
	svc_pool_wake_idle_thread(pool);
}

and svc_pool_wake_idle_thread() only consults that one pool's idle
list, with no fallback to a sibling pool. If a connection lands on a
node whose pool has no threads, does its transport sit on sp_xprts and
never get serviced?
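
The concern can be shown with a small userspace model. This is only a
sketch: struct toy_pool and toy_enqueue() are hypothetical stand-ins,
every thread is treated as idle, and the real enqueue path may differ.
It models exactly one property, that a wakeup looks at a single pool
and never at a sibling.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_POOLS 8		/* arbitrary bound for this sketch */

struct toy_pool {
	unsigned int nr_threads;	/* threads assigned to this pool */
	unsigned int nr_queued;		/* transports waiting on this pool */
};

/*
 * Queue one transport on the pool chosen from "node" and try to wake
 * a thread from that pool only. Returns true if a thread picked the
 * transport up, false if it was left sitting on the queue.
 */
bool toy_enqueue(struct toy_pool *pools, unsigned int npools,
		 unsigned int node)
{
	struct toy_pool *pool;

	assert(npools > 0 && npools <= TOY_MAX_POOLS);
	pool = &pools[node % npools];

	pool->nr_queued++;
	if (pool->nr_threads == 0)
		return false;	/* no thread to wake, no sibling fallback */

	pool->nr_queued--;	/* a thread of this pool takes the transport */
	return true;
}
```

With four pools and threads only in pools 0 and 1, a transport
enqueued from node 3 stays queued in this model until something adds a
thread to pool 3.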

The old global default kept a single pool, so this configuration could
not arise without an explicit pernode/percpu selection. With global
mode removed, is there a way left for an admin to avoid empty pools on
a server that runs fewer threads than it has NUMA nodes, or should the
thread distribution or the enqueue path guarantee work only reaches a
populated pool?

Separately, the pool_mode knob is preserved but every accepted value
now behaves as pernode. Documentation/admin-guide/kernel-parameters.txt
still presents the four modes as distinct:

	auto	    the server chooses an appropriate mode
		    automatically using heuristics
	global	    a single global pool contains all CPUs
	percpu	    one pool for each CPU
	pernode	    one pool for each NUMA node (equivalent
		    to global on non-NUMA machines)

and states the option "will affect which CPUs will do NFS serving."
The patch should update this documentation.


--
Chuck Lever