Re: [PATCH] blk-mq: nvme: Fix general protection fault in nvme_setup_descriptor_pools()

From: Sungwoo Kim

Date: Mon Mar 09 2026 - 20:01:01 EST


On Mon, Mar 9, 2026 at 11:31 AM Caleb Sander Mateos
<csander@xxxxxxxxxxxxxxx> wrote:
>
> On Sun, Mar 8, 2026 at 11:30 PM Sungwoo Kim <iam@xxxxxxxxxxxx> wrote:
> >
> > The numa_node can be < 0 since NUMA_NO_NODE = -1. However,
> > struct blk_mq_hw_ctx{} defines numa_node as unsigned int. As a result,
> > numa_node is set to UINT_MAX for NUMA_NO_NODE in blk_mq_alloc_hctx().
>
> The node argument to blk_mq_alloc_hctx() comes from
> blk_mq_alloc_and_init_hctx(), which is called by
> blk_mq_alloc_and_init_hctx() with int node = blk_mq_get_hctx_node(set,
> i). node = NUMA_NO_NODE would suggest that blk_mq_hw_queue_to_node()
> doesn't find any CPU affinitized to the queue. Is that even possible?

Thanks for your review, Caleb.

blk_mq_hw_queue_to_node() can return NUMA_NO_NODE if the number of
device queues exceeds the number of CPUs. Afterward, the caller
adjusts it to numa_node = set->numa_node.

set->numa_node can still be NUMA_NO_NODE if CONFIG_NUMA=n (trivial), or
if pcibus_to_node() returns NUMA_NO_NODE because ACPI doesn't provide
proximity information.
I'm not sure how common this is on real machines, though. We found the
crash in QEMU.
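
To illustrate the conversion described in the commit message, here is a
minimal userspace sketch (not kernel code). struct fake_hctx and
store_node() are made-up names for illustration; only the NUMA_NO_NODE
value and the unsigned type of the field are taken from this thread.

```c
#include <limits.h>

#define NUMA_NO_NODE (-1)	/* same value as stated in the commit message */

/* Hypothetical stand-in for the unsigned numa_node field of the hctx. */
struct fake_hctx {
	unsigned int numa_node;
};

/* Store a signed node id into the unsigned field and return what was stored. */
static unsigned int store_node(struct fake_hctx *hctx, int node)
{
	hctx->numa_node = node;	/* implicit int -> unsigned int conversion */
	return hctx->numa_node;
}
```

With node = NUMA_NO_NODE the stored value is UINT_MAX, so using it
directly as an array index reads far outside the array.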

> > static struct nvme_descriptor_pools *
> > -nvme_setup_descriptor_pools(struct nvme_dev *dev, unsigned numa_node)
> > +nvme_setup_descriptor_pools(struct nvme_dev *dev, int numa_node)
> > {
> > -	struct nvme_descriptor_pools *pools = &dev->descriptor_pools[numa_node];
> > +	struct nvme_descriptor_pools *pools;
> >  	size_t small_align = NVME_SMALL_POOL_SIZE;
> >
> > +	if (numa_node == NUMA_NO_NODE)
> > +		pools = &dev->descriptor_pools[numa_node_id()];
> > +	else
> > +		pools = &dev->descriptor_pools[numa_node];
>
> Simpler: if (numa_node == NUMA_NO_NODE) numa_node = numa_node_id();
>

Thanks, I will modify it in V2.
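
A rough userspace sketch of that simplification, assuming the parameter
is a signed int as in the patch. fake_numa_node_id() and
normalize_node() are hypothetical names; the former only stands in for
the kernel's numa_node_id() and always returns 0 here.

```c
#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for numa_node_id(); always reports node 0. */
static int fake_numa_node_id(void)
{
	return 0;
}

/* Replace NUMA_NO_NODE with a usable node id before it is used as an index. */
static int normalize_node(int numa_node)
{
	if (numa_node == NUMA_NO_NODE)
		numa_node = fake_numa_node_id();
	return numa_node;
}
```

Normalizing the value once up front leaves a single indexing expression
instead of the if/else pair.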

Sungwoo.