Re: [PATCH net-next 0/3] vhost: accelerate metadata access through vmap()

From: Jason Wang
Date: Tue Dec 25 2018 - 05:09:32 EST



On 2018/12/25 äå2:12, Michael S. Tsirkin wrote:
On Mon, Dec 24, 2018 at 04:32:39PM +0800, Jason Wang wrote:
On 2018/12/14 äå8:33, Michael S. Tsirkin wrote:
On Fri, Dec 14, 2018 at 11:42:18AM +0800, Jason Wang wrote:
On 2018/12/13 äå11:27, Michael S. Tsirkin wrote:
On Thu, Dec 13, 2018 at 06:10:19PM +0800, Jason Wang wrote:
Hi:

This series tries to access virtqueue metadata through kernel virtual
address instead of copy_user() friends since they had too much
overheads like checks, spec barriers or even hardware feature
toggling.
Userspace accesses through remapping tricks and next time there's a need
for a new barrier we are left to figure it out by ourselves.
I don't get here, do you mean spec barriers?
I mean the next barrier people decide to put into userspace
memory accesses.

It's completely unnecessary for
vhost which is kernel thread.
It's defence in depth. Take a look at the commit that added them.
And yes quite possibly in most cases we actually have a spec
barrier in the validation phase. If we do let's use the
unsafe variants so they can be found.

unsafe variants can only work if you can batch userspace access. This is not
necessarily the case for light load.

Do we care a lot about the light load? How would you benchmark it?


If we can reduce the latency that's will be more than what we expect.

1 byte TCP_RR shows 1.5%-2% improvement.


And even if you're right, vhost is not the
only place, there's lots of vmap() based accessing in kernel.
For sure. But if one can get by without get user pages, one
really should. Witness recently uncovered mess with file
backed storage.

We only pin metadata pages, I don't believe they will be used by any DMA.
It doesn't matter really, if you dirty pages behind the MM back
the problem is there.


Ok, but the usual case is anonymous pages, do we use file backed pages for user of vhost? And even if we use sometime, according to the pointer it's not something that can fix, RFC has been posted to solve this issue.

Thanks