Re: [PATCH net-next V4 5/5] vhost: access vq metadata through kernel virtual address

From: Jason Wang
Date: Wed Jan 23 2019 - 23:11:40 EST



On 2019/1/24 12:07 PM, Jason Wang wrote:

On 2019/1/23 10:08 PM, Michael S. Tsirkin wrote:
On Wed, Jan 23, 2019 at 05:55:57PM +0800, Jason Wang wrote:
It was noticed that the copy_user() friends that were used to access
virtqueue metadata tend to be very expensive for dataplane
implementations like vhost since they involve lots of software checks,
speculation barriers, and hardware feature toggling (e.g. SMAP). The
extra cost is more obvious when transferring small packets since
the time spent on metadata access becomes more significant.

This patch tries to eliminate those overheads by accessing the metadata
through kernel virtual addresses set up by vmap(). To allow the pages to
be migrated, instead of pinning them through GUP, we use MMU notifiers to
invalidate vmaps and re-establish them during each round of metadata
prefetching if necessary. For devices that don't use metadata
prefetching, the memory accessors fall back to the normal copy_user()
implementation gracefully. The invalidation is synchronized with the
datapath through the vq mutex, and in order to avoid holding the vq mutex
during range checking, the MMU notifier is torn down when trying to
modify vq metadata.

Another thing is that the kernel lacks an efficient solution for tracking
dirty pages through vmap(), which will lead to issues if vhost is using
file-backed memory that needs care for writeback. This patch solves this
issue by just skipping any vma that is file backed and falling back to
the normal copy_user() friends. This might introduce some overhead for
file-backed users, but considering this use case is rare, we could do
optimizations on top.
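
The invalidate, re-establish and fall-back behaviour described in the two paragraphs above can be modeled in plain user-space C. This is only a rough sketch under stated assumptions, not the patch's code: `struct meta_map`, `meta_invalidate()`, `meta_prefetch()` and `meta_read()` are hypothetical names, a plain pointer stands in for the vmap() address, memcpy() stands in for the copy_user() slow path, and the vq mutex is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one cached metadata mapping (not a kernel struct). */
struct meta_map {
    const uint16_t *user;   /* stands in for the userspace address */
    const uint16_t *direct; /* stands in for the vmap() address, NULL if invalid */
    int file_backed;        /* nonzero: never map directly, always copy */
    unsigned slow_reads;    /* counts fallbacks, for illustration only */
};

/* Models the MMU notifier callback: drop the direct mapping. */
static void meta_invalidate(struct meta_map *m)
{
    m->direct = NULL;
}

/* Models metadata prefetch: re-establish the mapping if it is allowed. */
static void meta_prefetch(struct meta_map *m)
{
    if (!m->direct && !m->file_backed)
        m->direct = m->user;
}

/* Fast path through the direct pointer, otherwise fall back to copying. */
static uint16_t meta_read(struct meta_map *m, size_t idx)
{
    uint16_t val;

    if (m->direct)
        return m->direct[idx];

    m->slow_reads++;
    memcpy(&val, &m->user[idx], sizeof(val));
    return val;
}
```

In this model a read before any `meta_prefetch()` call, or after `meta_invalidate()`, takes the copy path and bumps `slow_reads`, while a mapping marked `file_backed` never gets a direct pointer at all.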

Note that this was only done when device IOTLB is not enabled. We
could use similar method to optimize it in the future.

Tests show at most about 22% improvement on TX PPS when using
virtio-user + vhost_net + xdp1 + TAP on 2.6GHz Broadwell:

        SMAP on | SMAP off
Before: 5.0Mpps | 6.6Mpps
After:  6.1Mpps | 7.4Mpps
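
As a quick check of the "at most about 22%" figure, the relative gain can be recomputed from the table alone. The helper name `pct_gain()` is made up for this illustration.

```c
/* Relative gain of `after` over `before`, in percent. */
static double pct_gain(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

With the numbers above, `pct_gain(5.0, 6.1)` gives 22% for SMAP on and `pct_gain(6.6, 7.4)` gives roughly 12.1% for SMAP off, so the 22% is the SMAP-on case.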

Signed-off-by: Jason Wang <jasowang@xxxxxxxxxx>

So this is the bulk of the change.
Three things that I need to look into:
- Are there any security issues with bypassing the speculation barrier
  that is normally present after access_ok?


If we can make sure the bypassing was only used in a kthread (vhost), it should be fine I think.


- How hard does the special handling for
  file backed storage make testing?


It's as simple as un-commenting vhost_can_vmap()? Or I can try to hack qemu or dpdk to test this.


  On the one hand we could add a module parameter to
  force copy to/from user. On the other, that's
  another configuration we need to support.


That sounds sub-optimal since it leaves the choice to users.


  But iotlb is not using vmap, so maybe that's enough
  for testing.
- How hard is it to figure out which mode uses which code?


It's as simple as tracing __get_user() usage in vhost process?

Thanks





Meanwhile, could you pls post data comparing this last patch with the
below? This removes the speculation barrier replacing it with a
(useless but at least more lightweight) data dependency.
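
For readers unfamiliar with the idea, a "data dependency" in place of a speculation barrier usually means masking the value branchlessly, so that a later access depends on the result of the bounds check. The sketch below is in the spirit of the kernel's array_index_nospec() helper but is plain user-space C; `index_mask()` and `index_clamp()` are hypothetical names, and it does not claim to show what the non-underscore accessors do on any particular architecture. A real implementation also has to stop the compiler from turning the mask back into a branch, which this sketch does not attempt.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Returns all ones when index < size and zero otherwise, without a branch.
 * Assumes both values are at most LONG_MAX.
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
    unsigned long top = sizeof(unsigned long) * CHAR_BIT - 1;
    unsigned long x = index | (size - 1UL - index);

    /* The top bit of ~x is 1 exactly when index < size. */
    return 0UL - (~x >> top);
}

/* Force an out-of-bounds index to 0, so a later load depends on the mask. */
static unsigned long index_clamp(unsigned long index, unsigned long size)
{
    return index & index_mask(index, size);
}
```

An in-bounds index passes through `index_clamp()` unchanged, while an out-of-bounds one becomes 0, so even a mispredicted bounds check cannot steer the following load to an arbitrary address.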


SMAP off

Your patch: 7.2Mpps

vmap: 7.4Mpps

I didn't test with SMAP on, since it will be much slower for sure.

Thanks



Thanks!


diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index bac939af8dbb..352ee7e14476 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -739,7 +739,7 @@ static int vhost_copy_to_user(struct vhost_virtqueue *vq, void __user *to,
     int ret;
 
     if (!vq->iotlb)
-        return __copy_to_user(to, from, size);
+        return copy_to_user(to, from, size);
     else {
         /* This function should be called after iotlb
          * prefetch, which means we're sure that all vq
@@ -752,7 +752,7 @@ static int vhost_copy_to_user(struct vhost_virtqueue *vq, void __user *to,
                        VHOST_ADDR_USED);
 
         if (uaddr)
-            return __copy_to_user(uaddr, from, size);
+            return copy_to_user(uaddr, from, size);
 
         ret = translate_desc(vq, (u64)(uintptr_t)to, size, vq->iotlb_iov,
                        ARRAY_SIZE(vq->iotlb_iov),
@@ -774,7 +774,7 @@ static int vhost_copy_from_user(struct vhost_virtqueue *vq, void *to,
     int ret;
 
     if (!vq->iotlb)
-        return __copy_from_user(to, from, size);
+        return copy_from_user(to, from, size);
     else {
         /* This function should be called after iotlb
          * prefetch, which means we're sure that vq
@@ -787,7 +787,7 @@ static int vhost_copy_from_user(struct vhost_virtqueue *vq, void *to,
         struct iov_iter f;
 
         if (uaddr)
-            return __copy_from_user(to, uaddr, size);
+            return copy_from_user(to, uaddr, size);
 
         ret = translate_desc(vq, (u64)(uintptr_t)from, size, vq->iotlb_iov,
                        ARRAY_SIZE(vq->iotlb_iov),
@@ -855,13 +855,13 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq,
 ({ \
     int ret = -EFAULT; \
     if (!vq->iotlb) { \
-        ret = __put_user(x, ptr); \
+        ret = put_user(x, ptr); \
     } else { \
         __typeof__(ptr) to = \
             (__typeof__(ptr)) __vhost_get_user(vq, ptr, \
                       sizeof(*ptr), VHOST_ADDR_USED); \
         if (to != NULL) \
-            ret = __put_user(x, to); \
+            ret = put_user(x, to); \
         else \
             ret = -EFAULT; \
     } \
@@ -872,14 +872,14 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq,
 ({ \
     int ret; \
     if (!vq->iotlb) { \
-        ret = __get_user(x, ptr); \
+        ret = get_user(x, ptr); \
     } else { \
         __typeof__(ptr) from = \
             (__typeof__(ptr)) __vhost_get_user(vq, ptr, \
                                sizeof(*ptr), \
                                type); \
         if (from != NULL) \
-            ret = __get_user(x, from); \
+            ret = get_user(x, from); \
         else \
             ret = -EFAULT; \
     } \