On Wed, Sep 27, 2017 at 08:35:47AM +0800, Jason Wang wrote:
> On 2017-09-27 03:19, Michael S. Tsirkin wrote:
> > On Fri, Sep 22, 2017 at 04:02:32PM +0800, Jason Wang wrote:
> > > This patch introduces vhost_prefetch_desc_indices() which could batch
> > > descriptor indices fetching and used ring updating. This intends to
> > > reduce the cache misses of indices fetching and updating and reduce
> > > cache line bounce when virtqueue is almost full. copy_to_user() was
> > > used in order to benefit from modern cpus that support fast string
> > > copy. Batched virtqueue processing will be the first user.
> > >
> > > Signed-off-by: Jason Wang <jasowang@xxxxxxxxxx>
> > > ---
> > >  drivers/vhost/vhost.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  drivers/vhost/vhost.h |  3 +++
> > >  2 files changed, 58 insertions(+)
> > >
> > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > > index f87ec75..8424166d 100644
> > > --- a/drivers/vhost/vhost.c
> > > +++ b/drivers/vhost/vhost.c
> > > @@ -2437,6 +2437,61 @@ struct vhost_msg_node *vhost_dequeue_msg(struct vhost_dev *dev,
> > >  }
> > >  EXPORT_SYMBOL_GPL(vhost_dequeue_msg);
> > > +int vhost_prefetch_desc_indices(struct vhost_virtqueue *vq,
> > > +				struct vring_used_elem *heads,
> > > +				u16 num, bool used_update)
> > why do you need to combine used update with prefetch?
>
> For better performance

Why is sticking a branch in there better than requesting the update
conditionally from the caller?

> and I believe we don't care about the overhead when
> we meet errors in tx.

That's a separate question, I do not really understand how
you can fetch a descriptor and update the used ring at the same
time. This allows the guest to overwrite the buffer.
I might be misunderstanding what is going on here though.
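
To make the caller-side alternative concrete, here is a rough userspace
sketch, not kernel code. The toy_* names and the simplified ring layout
are made up for illustration, and memory barriers and copy_to_user() are
left out entirely. It only shows fetching the head indices and updating
the used ring as two separate steps, with the caller deciding when (and
whether) to do the second one:

```c
#include <stdint.h>

#define RING_SIZE 8

/* Simplified stand-in for struct vring_used_elem (hypothetical). */
struct used_elem {
	uint32_t id;
	uint32_t len;
};

/* Toy virtqueue state; field names are assumptions, not the vhost ones. */
struct toy_vq {
	uint16_t avail_ring[RING_SIZE];	/* head indices published by the guest */
	uint16_t avail_idx;		/* guest's producer index */
	uint16_t last_avail_idx;	/* host's consumer index */
	struct used_elem used_ring[RING_SIZE];
	uint16_t used_idx;		/* host's producer index on the used ring */
};

/* Step 1: read up to num head indices. The used ring is not touched. */
static int toy_prefetch_heads(struct toy_vq *vq, struct used_elem *heads,
			      uint16_t num)
{
	uint16_t pending = (uint16_t)(vq->avail_idx - vq->last_avail_idx);
	uint16_t n = pending < num ? pending : num;
	uint16_t i;

	for (i = 0; i < n; i++) {
		heads[i].id = vq->avail_ring[(vq->last_avail_idx + i) % RING_SIZE];
		heads[i].len = 0;
	}
	vq->last_avail_idx = (uint16_t)(vq->last_avail_idx + n);
	return n;
}

/* Step 2: publish n entries on the used ring. The caller invokes this
 * only when it wants the guest to be able to reuse those buffers. */
static void toy_add_used(struct toy_vq *vq, const struct used_elem *heads,
			 uint16_t n)
{
	uint16_t i;

	for (i = 0; i < n; i++)
		vq->used_ring[(vq->used_idx + i) % RING_SIZE] = heads[i];
	vq->used_idx = (uint16_t)(vq->used_idx + n);
}
```

With this split there is no branch in the fetch path, and nothing is
visible to the guest on the used ring until the caller asks for it.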