Re: virtio-blk: should num_vqs be limited by num_possible_cpus()?

From: Dongli Zhang
Date: Thu Mar 14 2019 - 11:36:49 EST




On 03/14/2019 08:32 PM, Michael S. Tsirkin wrote:
> On Tue, Mar 12, 2019 at 10:22:46AM -0700, Dongli Zhang wrote:
>> I observed that there is one msix vector for config and one shared vector
>> for all queues in below qemu cmdline, when the num-queues for virtio-blk
>> is more than the number of possible cpus:
>>
>> qemu: "-smp 4" while "-device virtio-blk-pci,drive=drive-0,id=virtblk0,num-queues=6"
>
> So why do this?

I observed this while testing virtio-blk and the block layer.

I simply assigned a very large 'num-queues' to virtio-blk and kept changing the
number of vcpus in order to study blk-mq.

By default, num-queues is 64 for nvme (qemu), while it is 1 for virtio-blk.

>
>> # cat /proc/interrupts
>>            CPU0       CPU1       CPU2       CPU3
>> ... ...
>>  24:          0          0          0          0   PCI-MSI 65536-edge      virtio0-config
>>  25:          0          0          0         59   PCI-MSI 65537-edge      virtio0-virtqueues
>> ... ...
>>
>>
>> However, when num-queues is the same as number of possible cpus:
>>
>> qemu: "-smp 4" while "-device virtio-blk-pci,drive=drive-0,id=virtblk0,num-queues=4"
>>
>> # cat /proc/interrupts
>>            CPU0       CPU1       CPU2       CPU3
>> ... ...
>>  24:          0          0          0          0   PCI-MSI 65536-edge      virtio0-config
>>  25:          2          0          0          0   PCI-MSI 65537-edge      virtio0-req.0
>>  26:          0         35          0          0   PCI-MSI 65538-edge      virtio0-req.1
>>  27:          0          0         32          0   PCI-MSI 65539-edge      virtio0-req.2
>>  28:          0          0          0          0   PCI-MSI 65540-edge      virtio0-req.3
>> ... ...
>>
>> In above case, there is one msix vector per queue.
>>
>>
>> This is because the max number of queues is not limited by the number of
>> possible cpus.
>>
>> By default, nvme (regardless about write_queues and poll_queues) and
>> xen-blkfront limit the number of queues with num_possible_cpus().
>>
>>
>> Is this by design on purpose, or can we fix with below?
>>
>>
>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
>> index 4bc083b..df95ce3 100644
>> --- a/drivers/block/virtio_blk.c
>> +++ b/drivers/block/virtio_blk.c
>> @@ -513,6 +513,8 @@ static int init_vq(struct virtio_blk *vblk)
>>  	if (err)
>>  		num_vqs = 1;
>>
>> +	num_vqs = min(num_possible_cpus(), num_vqs);
>> +
>>  	vblk->vqs = kmalloc_array(num_vqs, sizeof(*vblk->vqs), GFP_KERNEL);
>>  	if (!vblk->vqs)
>>  		return -ENOMEM;
>> --
>>
>>
>> PS: The same issue is applicable to virtio-scsi as well.
>>
>> Thank you very much!
>>
>> Dongli Zhang
>
> I don't think this will address the issue if there's vcpu hotplug though.
> Because it's not about num_possible_cpus it's about the # of active VCPUs,
> right? Does block handle CPU hotplug generally?
> We could maybe address that by switching vq to msi vector mapping in
> a cpu hotplug notifier...
>

It looks like it is about num_possible_cpus/nr_cpu_ids, even with cpu hotplug.


For instance, below is a case where only 2 out of 6 cpus are initialized while
virtio-blk has 6 queues.

"-smp 2,maxcpus=6" and "-device
virtio-blk-pci,drive=drive0,id=disk0,num-queues=6,iothread=io1"

# cat /sys/devices/system/cpu/present
0-1
# cat /sys/devices/system/cpu/possible
0-5
# cat /proc/interrupts | grep virtio
 24:          0          0   PCI-MSI 65536-edge      virtio0-config
 25:       1864          0   PCI-MSI 65537-edge      virtio0-req.0
 26:          0       1069   PCI-MSI 65538-edge      virtio0-req.1
 27:          0          0   PCI-MSI 65539-edge      virtio0-req.2
 28:          0          0   PCI-MSI 65540-edge      virtio0-req.3
 29:          0          0   PCI-MSI 65541-edge      virtio0-req.4
 30:          0          0   PCI-MSI 65542-edge      virtio0-req.5

6 + 1 irqs are assigned even though 4 out of 6 cpus are still offline.
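
To make the comparison above easy to repeat, here is a minimal user-space
sketch that counts the cpus named in a cpulist string such as the contents of
/sys/devices/system/cpu/possible or .../present. The helper name
cpulist_count() is hypothetical and exists only for this illustration; it is
not a kernel or libc function.

```c
#include <stdlib.h>

/*
 * Count the cpus named in a sysfs-style cpulist string such as "0-5" or
 * "0-1,4,6-7". A trailing newline (as read from sysfs) is accepted.
 * Returns -1 on malformed input.
 *
 * Hypothetical helper for illustration only.
 */
int cpulist_count(const char *s)
{
	int count = 0;

	while (*s != '\0' && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s || lo < 0)
			return -1;	/* no digits, or a negative cpu id */
		s = end;

		if (*s == '-') {	/* range "lo-hi" */
			s++;
			hi = strtol(s, &end, 10);
			if (end == s || hi < lo)
				return -1;
			s = end;
		}
		count += (int)(hi - lo + 1);

		if (*s == ',')
			s++;		/* next entry */
		else if (*s != '\0' && *s != '\n')
			return -1;	/* unexpected character */
	}
	return count;
}
```

With the values shown above, "0-1" (present) gives 2 and "0-5" (possible)
gives 6, and it is the latter that matches the 6 request queue irqs.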


Below is the nvme emulated by qemu. While only 2 out of 6 cpus are initially
assigned, nvme has 64 queues by default.

"-smp 2,maxcpus=6" and "-device nvme,drive=drive1,serial=deadbeaf1"

# cat /sys/devices/system/cpu/present
0-1
# cat /sys/devices/system/cpu/possible
0-5
# cat /proc/interrupts | grep nvme
 31:          0         16   PCI-MSI 81920-edge      nvme0q0
 32:         35          0   PCI-MSI 81921-edge      nvme0q1
 33:          0         42   PCI-MSI 81922-edge      nvme0q2
 34:          0          0   PCI-MSI 81923-edge      nvme0q3
 35:          0          0   PCI-MSI 81924-edge      nvme0q4
 36:          0          0   PCI-MSI 81925-edge      nvme0q5
 37:          0          0   PCI-MSI 81926-edge      nvme0q6

6 io queues are assigned irqs, although only 2 cpus are online.


When only 2 out of 48 cpus are online, the block layer still creates 48 hctx.

"-smp 2,maxcpus=48" and "-device
virtio-blk-pci,drive=drive0,id=disk0,num-queues=48,iothread=io1"

# ls /sys/kernel/debug/block/vda/ | grep hctx | wc -l
48


The above indicates that the number of hw queues/irqs is tied to
num_possible_cpus/nr_cpu_ids rather than to the number of online cpus.
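
As a rough summary of the observations in this thread, the sketch below
models the clamp from the proposed patch together with the MSI-X layout seen
in /proc/interrupts. Both function names are hypothetical, and the model only
encodes what was observed here (per-queue vectors when the queue count does
not exceed the possible cpus, otherwise one config vector plus one shared
vector). The real decision is made in the virtio PCI and irq affinity code,
which this does not reproduce.

```c
/*
 * Clamp the requested queue count to the number of possible cpus, as the
 * proposed change to init_vq() would. Both inputs are treated as at least 1.
 *
 * Hypothetical user-space model, not kernel code.
 */
unsigned int clamp_num_vqs(unsigned int requested, unsigned int possible_cpus)
{
	if (possible_cpus == 0)
		possible_cpus = 1;
	if (requested == 0)
		requested = 1;
	return requested < possible_cpus ? requested : possible_cpus;
}

/*
 * Number of MSI-X vectors observed in this thread for a given queue count:
 *   num_vqs <= possible cpus: 1 config vector + 1 vector per queue
 *   num_vqs >  possible cpus: 1 config vector + 1 vector shared by all queues
 *
 * Assumption: this is a simplification fitted to the three test runs above.
 */
unsigned int observed_msix_vectors(unsigned int num_vqs,
				   unsigned int possible_cpus)
{
	if (num_vqs <= possible_cpus)
		return num_vqs + 1;
	return 2;
}
```

For "-smp 4" with num-queues=6 the model gives 2 vectors without the clamp
and 4 + 1 vectors with it, which is the behaviour the patch is after.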

Dongli Zhang