Re: virtio ring layout changes for optimal single-stream performance

From: Michael S. Tsirkin
Date: Thu Jan 21 2016 - 14:03:37 EST


On Thu, Jan 21, 2016 at 04:38:36PM +0100, Cornelia Huck wrote:
> On Thu, 21 Jan 2016 15:39:26 +0200
> "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
>
> > Hi all!
> > I have been experimenting with alternative virtio ring layouts,
> > in order to speed up single stream performance.
> >
> > I have just posted a benchmark I wrote for the purpose, and a (partial)
> > alternative layout implementation. This achieves 20-40% reduction in
> > virtio overhead in the (default) polling mode.
> >
> > http://article.gmane.org/gmane.linux.kernel.virtualization/26889
> >
> > The layout is trying to be as simple as possible, to reduce
> > the number of cache lines bouncing between CPUs.
>
> Some kind of diagram or textual description would really help to review
> this.
>
> >
> > For benchmarking, the idea is to emulate virtio in user-space,
> > artificially adding overhead for e.g. signalling to match what happens
> > in case of a VM.
>
> Hm... is this overhead comparable enough between different platforms so
> that you can get a halfway realistic scenario?

On x86 it seems pretty stable.
It's a question of setting VMEXIT_CYCLES and VMENTRY_CYCLES correctly.
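
To make the idea concrete, here is a minimal sketch of how a user-space
benchmark could charge a fixed cycle cost for each emulated exit and entry.
Only the names VMEXIT_CYCLES and VMENTRY_CYCLES come from this thread; the
value 500, the helpers read_cycles(), wait_cycles(), vmexit(), vmentry() and
emulated_kick() are assumptions made for illustration and are not necessarily
what the posted benchmark does. On non-x86 builds the sketch falls back to
nanoseconds as a stand-in for cycles.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Read the x86 time stamp counter. */
static inline uint64_t read_cycles(void)
{
	return __rdtsc();
}
#else
/* Assumption: no TSC here, so nanoseconds stand in for cycles. */
static inline uint64_t read_cycles(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
#endif

/* Placeholder costs; these need tuning per platform. */
#define VMEXIT_CYCLES 500
#define VMENTRY_CYCLES 500

/* Busy-wait until at least 'cycles' counter ticks have elapsed. */
static inline void wait_cycles(uint64_t cycles)
{
	uint64_t start = read_cycles();

	while (read_cycles() - start < cycles)
		;
}

/* Emulated cost of leaving the guest. */
static inline void vmexit(void)
{
	wait_cycles(VMEXIT_CYCLES);
}

/* Emulated cost of re-entering the guest. */
static inline void vmentry(void)
{
	wait_cycles(VMENTRY_CYCLES);
}

/* Hypothetical guest-side notification: pay for one exit and one entry. */
static inline void emulated_kick(void)
{
	vmexit();
	/* A real host would handle the notification at this point. */
	vmentry();
}
```

With something like this, every place where a real guest would trap to the
host calls emulated_kick(), so a layout that needs fewer notifications pays
proportionally less of the artificial overhead.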

> What about things like
> endianness conversions?

I didn't bother with them yet.

> >
> > I'd be very curious to get feedback on this, in particular, some people
> > discussed using vectored operations to format virtio ring - would it
> > conflict with this work?
> >
> > You are all welcome to post enhancements or more layout alternatives as
> > patches.
>
> Let me see if I can find time to experiment a bit.