Re: kvm virtio ethernet ring on guest side over high throughput (packets per second)
From: Alejandro Comisario
Date: Fri Jan 24 2014 - 13:41:18 EST
Well, it's confirmed ... because of the shape of our traffic,
constant bursts of many, many small packets (1.5k / 3.5k), Nagle's
algorithm was indeed the root cause of our performance issues.
So I will consider this thread solved.
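For anyone hitting the same thing: disabling Nagle is a per-socket option
(TCP_NODELAY), set before or right after connect(). A minimal sketch in
Python (illustrative only, not our actual application code):

```python
import socket

# Minimal sketch: disable Nagle's algorithm on a TCP socket so that small
# writes are sent immediately instead of being coalesced into larger
# segments. (Illustrative only -- not our application's actual code.)
def make_nodelay_socket():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return s

s = make_nodelay_socket()
# The option reads back nonzero once set
print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
s.close()
```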
Thank you so much to everyone involved, especially the people from Red Hat.
Thanks a lot!
Alejandro Comisario
#melicloud CloudBuilders
Arias 3751, Piso 7 (C1430CRG)
Ciudad de Buenos Aires - Argentina
Cel: +549(11) 15-3770-1857
Tel : +54(11) 4640-8443
On Thu, Jan 23, 2014 at 4:25 PM, Alejandro Comisario
<alejandro.comisario@xxxxxxxxxxxxxxxx> wrote:
> Jason, Stefan ... thank you so much.
> At a glance, disabling Nagle's algorithm made the hundreds of thousands
> of "20ms" delays disappear immediately. Tomorrow we are going to run a
> "whole day" test again, and test client connectivity against NginX
> and Memcached to see whether, given the traffic we have (hundreds of
> thousands of packets per minute), Nagle introduced this delay.
>
> I'll get back to you tomorrow with the tests.
> Thanks again.
>
> Kindest regards.
>
>
>
>
> On Thu, Jan 23, 2014 at 12:14 AM, Jason Wang <jasowang@xxxxxxxxxx> wrote:
>> On 01/23/2014 05:32 AM, Alejandro Comisario wrote:
>>> Thank you so much Stefan for the help and cc'ing Michael & Jason.
>>> As you advised yesterday on IRC, today we are running some tests with
>>> the application setting TCP_NODELAY in the socket options.
>>>
>>> We will try that and get back to you with further information.
>>> In the meantime, here are the options the VMs are using while running:
>>>
>>> # ------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>> /usr/bin/kvm -S -M pc-1.0 -cpu
>>> core2duo,+lahf_lm,+rdtscp,+pdpe1gb,+aes,+popcnt,+x2apic,+sse4.2,+sse4.1,+dca,+xtpr,+cx16,+tm2,+est,+vmx,+ds_cpl,+pbe,+tm,+ht,+ss,+acpi,+ds
>>> -enable-kvm -m 32768 -smp 8,sockets=1,cores=6,threads=2 -name
>>> instance-00000254 -uuid d25b1b20-409e-4d7f-bd92-2ef4073c7c2b
>>> -nodefconfig -nodefaults -chardev
>>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-00000254.monitor,server,nowait
>>> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
>>> -no-shutdown -kernel /var/lib/nova/instances/instance-00000254/kernel
>>> -initrd /var/lib/nova/instances/instance-00000254/ramdisk -append
>>> root=/dev/vda console=ttyS0 -drive
>>> file=/var/lib/nova/instances/instance-00000254/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=writethrough
>>> -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
>>> -netdev tap,fd=19,id=hostnet0 -device
>>
>> Better to enable vhost as Stefan suggested. It may help a lot here.
>>> virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:27:d4:6d,bus=pci.0,addr=0x3
>>> -chardev file,id=charserial0,path=/var/lib/nova/instances/instance-00000254/console.log
>>> -device isa-serial,chardev=charserial0,id=serial0 -chardev
>>> pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1
>>> -usb -device usb-tablet,id=input0 -vnc 0.0.0.0:4 -k en-us -vga cirrus
>>> -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
>>> # ------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>
>>> best regards
>>>
>>>
>>>
>>>
>>> On Wed, Jan 22, 2014 at 12:22 PM, Stefan Hajnoczi <stefanha@xxxxxxxxx> wrote:
>>>> On Tue, Jan 21, 2014 at 04:06:05PM -0200, Alejandro Comisario wrote:
>>>>
>>>> CCed Michael Tsirkin and Jason Wang who work on KVM networking.
>>>>
>>>>> Hi guys. In the past, when using physical servers, we had several
>>>>> throughput issues with our APIs. In our case we measure this in
>>>>> packets per second, since we don't use that much bandwidth (Mb/s):
>>>>> our APIs respond with lots of very small packets (maximum response
>>>>> of 3.5k and average response of 1.5k). When we were using those
>>>>> physical servers and hit throughput capacity (detected via client
>>>>> timeouts), we tuned the Ethernet ring configuration and made the
>>>>> problem disappear.
>>>>>
>>>>> Today, with KVM and over 10k virtual instances, when we want to
>>>>> increase the throughput of KVM instances, we run into the fact that
>>>>> when using virtio on guests we have a maximum ring configuration of
>>>>> 256 TX/RX descriptors, and on the host side the attached vnet device
>>>>> has a txqueuelen of 500.
>>>>>
>>>>> What I want to know is: how can I tune the guest to support more
>>>>> packets per second, given that I know that's my bottleneck?
>>>> I suggest investigating performance in a systematic way. Set up a
>>>> benchmark that saturates the network. Post the details of the benchmark
>>>> and the results that you are seeing.
>>>>
>>>> Then, we can discuss how to investigate the root cause of the bottleneck.
>>>>
>>>>> * Does virtio expose a way to configure more descriptors in the virtual Ethernet ring?
>>>> No, ring size is hardcoded in QEMU (on the host).
>>>>
>>>>> * Does the use of vhost_net help me increase packets per
>>>>> second, and not only bandwidth?
>>>> vhost_net is generally the most performant network option.
>>>>
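For what it's worth, enabling vhost is a matter of adding vhost=on to the
tap netdev (e.g. -netdev tap,fd=19,id=hostnet0,vhost=on, assuming the
vhost_net module is loaded on the host). A quick host-side sanity check,
sketched in Python with standard Linux paths assumed:

```python
import os

# Sketch: check that the vhost-net backend is available on the host
# (/dev/vhost-net appears once the vhost_net module is loaded), and count
# the "vhost-<pid>" kernel worker threads that active vhost queues create.
def vhost_net_available():
    return os.path.exists("/dev/vhost-net")

def vhost_worker_count():
    count = 0
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        try:
            with open("/proc/" + pid + "/comm") as f:
                if f.read().strip().startswith("vhost-"):
                    count += 1
        except (IOError, OSError):
            # process exited between listdir() and open()
            pass
    return count

print(vhost_net_available(), vhost_worker_count())
```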
>>>>> Has anyone struggled with this before and knows where I can look?
>>>>> There's LOTS of information about KVM network performance tuning,
>>>>> but nothing related to increasing throughput in pps capacity.
>>>>>
>>>>> These are a couple of the configurations we currently have on the
>>>>> compute nodes:
>>>>>
>>>>> * 2x1Gb bonded interfaces (if you want to know the 20+ models we
>>>>> are using, just ask)
>>>>> * Multi-queue interfaces, pinned via IRQ affinity to different cores
>>>>> * Linux bridges, no VLANs, no Open vSwitch
>>>>> * Ubuntu 12.04, kernel 3.2.0-[40-48]
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/