On Mon, Oct 14, 2019 at 09:43:25AM +0800, Jason Wang wrote:
On 2019/10/13 äå4:27, Michael S. Tsirkin wrote:We never supported more than IOV_MAX descriptors.
On Sat, Oct 12, 2019 at 03:28:49PM +0800, Jason Wang wrote:Another possible issue, if we try to batch descriptor chain when we've
On 2019/10/11 äå9:45, Michael S. Tsirkin wrote:Yes, next patch only batches up to 64. But we do need iov_limit because
The idea is to support multiple ring formats by convertingIs iov_limit too much here? It can obviously increase the footprint. I guess
to a format-independent array of descriptors.
This costs extra cycles, but we gain in ability
to fetch a batch of descriptors in one go, which
is good for code cache locality.
To simplify benchmarking, I kept the old code
around so one can switch back and forth by
writing into a module parameter.
This will go away in the final submission.
This patch causes a minor performance degradation,
it's been kept as simple as possible for ease of review.
Next patch gets us back the performance by adding batching.
Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
---
drivers/vhost/test.c | 17 ++-
drivers/vhost/vhost.c | 299 +++++++++++++++++++++++++++++++++++++++++-
drivers/vhost/vhost.h | 16 +++
3 files changed, 327 insertions(+), 5 deletions(-)
diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 056308008288..39a018a7af2d 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -18,6 +18,9 @@
#include "test.h"
#include "vhost.h"
+static int newcode = 0;
+module_param(newcode, int, 0644);
+
/* Max number of bytes transferred before requeueing the job.
* Using this limit prevents one virtqueue from starving others. */
#define VHOST_TEST_WEIGHT 0x80000
@@ -58,10 +61,16 @@ static void handle_vq(struct vhost_test *n)
vhost_disable_notify(&n->dev, vq);
for (;;) {
- head = vhost_get_vq_desc(vq, vq->iov,
- ARRAY_SIZE(vq->iov),
- &out, &in,
- NULL, NULL);
+ if (newcode)
+ head = vhost_get_vq_desc_batch(vq, vq->iov,
+ ARRAY_SIZE(vq->iov),
+ &out, &in,
+ NULL, NULL);
+ else
+ head = vhost_get_vq_desc(vq, vq->iov,
+ ARRAY_SIZE(vq->iov),
+ &out, &in,
+ NULL, NULL);
/* On error, stop handling until the next kick. */
if (unlikely(head < 0))
break;
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 36ca2cf419bf..36661d6cb51f 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -301,6 +301,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
struct vhost_virtqueue *vq)
{
vq->num = 1;
+ vq->ndescs = 0;
vq->desc = NULL;
vq->avail = NULL;
vq->used = NULL;
@@ -369,6 +370,9 @@ static int vhost_worker(void *data)
static void vhost_vq_free_iovecs(struct vhost_virtqueue *vq)
{
+ kfree(vq->descs);
+ vq->descs = NULL;
+ vq->max_descs = 0;
kfree(vq->indirect);
vq->indirect = NULL;
kfree(vq->log);
@@ -385,6 +389,10 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
for (i = 0; i < dev->nvqs; ++i) {
vq = dev->vqs[i];
+ vq->max_descs = dev->iov_limit;
+ vq->descs = kmalloc_array(vq->max_descs,
+ sizeof(*vq->descs),
+ GFP_KERNEL);
the batching can only be done for descriptor without indirect or next set.
Then we may batch 16 or 64.
Thanks
guest can pass a long chain of scatter/gather.
We already have iovecs in a huge array so this does not look like
a big deal. If we ever teach the code to avoid the huge
iov arrays by handling huge s/g lists piece by piece,
we can make the desc array smaller at the same point.
already batched some descriptors, we may reach the limit then some of the
descriptors might need re-read.
Or we may need circular index (head, tail) in this case?
Thanks
And we don't batch more than iov_limit - IOV_MAX.
so buffer never overflows.