Re: [RFC PATCH v1] virtio_pci: only store successfully populated virtio_pci_vq_info

From: Link Lin

Date: Tue Apr 21 2026 - 17:48:03 EST


Hi everyone,

Friendly ping. Apologies if you are receiving this a second time: my
last ping was not sent in plain text mode and was rejected by some
mailing lists.

Please let me know if anyone has had a chance to look at this RFC
patch, or if any changes are needed.

Thanks,
Link

On Tue, Apr 21, 2026 at 2:24 PM Link Lin <linkl@xxxxxxxxxx> wrote:
>
> Hi everyone,
>
> Friendly ping on this RFC patch. Please let me know if anyone has had a chance to look at this, or if any changes are needed.
>
> Thanks,
> Link
>
> On Tue, Apr 7, 2026 at 2:25 PM Link Lin <linkl@xxxxxxxxxx> wrote:
>>
>> In environments where free page hinting is disabled, a kernel
>> panic is triggered when tearing down the virtio_balloon module:
>>
>> [12261.808190] Call trace:
>> [12261.808471] __list_del_entry_valid_or_report+0x18/0xe0
>> [12261.809064] vp_del_vqs+0x12c/0x270
>> [12261.809462] remove_common+0x80/0x98 [virtio_balloon]
>> [12261.810034] virtballoon_remove+0xfc/0x158 [virtio_balloon]
>> [12261.810663] virtio_dev_remove+0x68/0xf8
>> [12261.811108] device_release_driver_internal+0x17c/0x278
>> [12261.811701] driver_detach+0xd4/0x138
>> [12261.812117] bus_remove_driver+0x90/0xd0
>> [12261.812562] driver_unregister+0x40/0x70
>> [12261.813006] unregister_virtio_driver+0x20/0x38
>> [12261.813518] cleanup_module+0x20/0x7a8 [virtio_balloon]
>> [12261.814109] __arm64_sys_delete_module+0x278/0x3d0
>> [12261.814654] invoke_syscall+0x5c/0x120
>> [12261.815086] el0_svc_common+0x90/0xf8
>> [12261.815506] do_el0_svc+0x2c/0x48
>> [12261.815883] el0_svc+0x3c/0xa8
>> [12261.816235] el0t_64_sync_handler+0x8c/0x108
>> [12261.816724] el0t_64_sync+0x198/0x1a0
>>
>> The issue originates in vp_find_vqs_intx(). It allocates the
>> vp_dev->vqs array with kzalloc_objs(), sized by the nvqs count
>> provided by the caller, virtio_balloon::init_vqs(). However, not
>> all nvqs virtio_pci_vq_info objects are guaranteed to be populated.
>>
>> For example, when VIRTIO_BALLOON_F_FREE_PAGE_HINT is absent, the
>> VIRTIO_BALLOON_VQ_FREE_PAGE-th item in the vp_dev->vqs array is
>> never populated and remains a zero-initialized
>> virtio_pci_vq_info object, which eventually triggers the
>> __list_del_entry_valid_or_report() crash.
>>
>> Tested by applying this patch to a guest VM kernel with the
>> VIRTIO_BALLOON_F_REPORTING feature enabled and the
>> VIRTIO_BALLOON_F_FREE_PAGE_HINT feature disabled.
>> Without this patch, unloading the virtio_balloon module triggers a panic.
>> With this patch, no panic is observed.
>>
>> The fix is to index vp_dev->vqs by queue_idx instead of i. This
>> handles the case where vp_find_vqs_intx() skips vp_setup_vq() because
>> the caller passed a NULL vqs_info[i].name, i.e. the caller did not
>> populate all nvqs virtqueue_info objects. queue_idx is always the
>> correct index at which to store a successfully created and populated
>> virtio_pci_vq_info object. As a result, when the for-loop over nvqs
>> finishes without taking goto out_del_vqs, a virtio_pci_device object
>> stores exactly queue_idx valid virtio_pci_vq_info objects in its
>> vqs array.
>>
>> vp_find_vqs_msix() has a similar issue, so fix it in the same way.
>>
>> This patch is marked as RFC because we are uncertain whether any
>> virtio-pci code implicitly requires virtio_pci_device's vqs array to
>> always contain nvqs virtio_pci_vq_info objects, including
>> zero-initialized ones. We have not observed any issues in our
>> testing, but insights or alternatives are welcome!
>>
>> Signed-off-by: Link Lin <linkl@xxxxxxxxxx>
>> Co-developed-by: Jiaqi Yan <jiaqiyan@xxxxxxxxxx>
>> Signed-off-by: Jiaqi Yan <jiaqiyan@xxxxxxxxxx>
>> ---
>> drivers/virtio/virtio_pci_common.c | 10 ++++++----
>> 1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
>> index da97b6a988de..9b32301529e5 100644
>> --- a/drivers/virtio/virtio_pci_common.c
>> +++ b/drivers/virtio/virtio_pci_common.c
>> @@ -423,14 +423,15 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned int nvqs,
>> vqs[i] = NULL;
>> continue;
>> }
>> - vqs[i] = vp_find_one_vq_msix(vdev, queue_idx++, vqi->callback,
>> + vqs[i] = vp_find_one_vq_msix(vdev, queue_idx, vqi->callback,
>> vqi->name, vqi->ctx, false,
>> &allocated_vectors, vector_policy,
>> - &vp_dev->vqs[i]);
>> + &vp_dev->vqs[queue_idx]);
>> if (IS_ERR(vqs[i])) {
>> err = PTR_ERR(vqs[i]);
>> goto error_find;
>> }
>> + ++queue_idx;
>> }
>>
>> if (!avq_num)
>> @@ -485,13 +486,14 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned int nvqs,
>> vqs[i] = NULL;
>> continue;
>> }
>> - vqs[i] = vp_setup_vq(vdev, queue_idx++, vqi->callback,
>> + vqs[i] = vp_setup_vq(vdev, queue_idx, vqi->callback,
>> vqi->name, vqi->ctx,
>> - VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[i]);
>> + VIRTIO_MSI_NO_VECTOR, &vp_dev->vqs[queue_idx]);
>> if (IS_ERR(vqs[i])) {
>> err = PTR_ERR(vqs[i]);
>> goto out_del_vqs;
>> }
>> + ++queue_idx;
>> }
>>
>> if (!avq_num)
>> --
>> 2.53.0.1213.gd9a14994de-goog
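
To make the indexing problem described in the commit message concrete,
here is a minimal, self-contained C sketch. It is not kernel code:
demo_vq_info, demo_find_vqs(), demo_count_holes(),
demo_holes_for_balloon_layout() and the queue names are hypothetical
stand-ins. The model only assumes what the commit message states, namely
that entries with a NULL name are skipped while queue_idx counts only the
queues that were actually created.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NVQS 5

/* Hypothetical stand-in for virtio_pci_vq_info; it only records population. */
struct demo_vq_info {
	bool populated;
};

/*
 * Mimic the shape of the find_vqs loops: skip entries with a NULL name and
 * number the remaining queues densely with queue_idx.
 *
 * store_by_queue_idx == false models "&vp_dev->vqs[i]" (before the patch);
 * store_by_queue_idx == true models "&vp_dev->vqs[queue_idx]" (after it).
 *
 * Returns the number of queues created.
 */
static unsigned int demo_find_vqs(const char *const names[DEMO_NVQS],
				  struct demo_vq_info infos[DEMO_NVQS],
				  bool store_by_queue_idx)
{
	unsigned int queue_idx = 0;
	unsigned int i;

	for (i = 0; i < DEMO_NVQS; i++)
		infos[i].populated = false;	/* like a zeroed allocation */

	for (i = 0; i < DEMO_NVQS; i++) {
		unsigned int slot;

		if (!names[i])
			continue;		/* queue not requested by caller */

		slot = store_by_queue_idx ? queue_idx : i;
		assert(slot < DEMO_NVQS);
		infos[slot].populated = true;
		queue_idx++;
	}

	return queue_idx;
}

/* Count unpopulated slots among the first nr_created entries. */
static unsigned int demo_count_holes(const struct demo_vq_info infos[DEMO_NVQS],
				     unsigned int nr_created)
{
	unsigned int holes = 0;
	unsigned int i;

	for (i = 0; i < nr_created && i < DEMO_NVQS; i++)
		if (!infos[i].populated)
			holes++;

	return holes;
}

/*
 * Balloon-like layout from the commit message: the fourth entry (standing in
 * for VIRTIO_BALLOON_VQ_FREE_PAGE) has no name. Names are illustrative only.
 */
static unsigned int demo_holes_for_balloon_layout(bool store_by_queue_idx)
{
	static const char *const names[DEMO_NVQS] = {
		"inflate", "deflate", "stats", NULL, "reporting",
	};
	struct demo_vq_info infos[DEMO_NVQS];
	unsigned int nr_created;

	nr_created = demo_find_vqs(names, infos, store_by_queue_idx);
	return demo_count_holes(infos, nr_created);
}
```

Under these assumptions, demo_holes_for_balloon_layout(false) leaves one
unpopulated slot among the first queue_idx entries, while
demo_holes_for_balloon_layout(true) leaves none, which is the invariant
the patch aims for.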