On Mon, Jan 07, 2019 at 02:50:17PM +0800, Jason Wang wrote:
On 2019/1/7 12:17, Michael S. Tsirkin wrote:
Fundamentally yes.
On Mon, Jan 07, 2019 at 11:53:41AM +0800, Jason Wang wrote:
Yes.
On 2019/1/7 11:28, Michael S. Tsirkin wrote:
You mean __uaccess_begin_nospec introduced by
On Mon, Jan 07, 2019 at 10:19:03AM +0800, Jason Wang wrote:
It's the effect of removing speculation barrier.
On 2019/1/3 4:47, Michael S. Tsirkin wrote:
OK so would you say it's really unsafe versus safe accesses?
On Sat, Dec 29, 2018 at 08:46:51PM +0800, Jason Wang wrote:
On machine without SMAP (Sandy Bridge):
This series tries to access virtqueue metadata through kernel virtual
address instead of copy_user() friends since they had too much
overhead like checks, spec barriers or even hardware feature
toggling.
Will review, thanks!
One question that comes to mind is whether it's all about bypassing
stac/clac. Could you please include a performance comparison with
nosmap?
Before: 4.8Mpps
After: 5.2Mpps
Or would you say it's just a better written code?
commit 304ec1b050310548db33063e567123fae8fd0301
?
So fundamentally we do access_ok checks when supplying
the memory table to the kernel thread, and we should
do the spec barrier there.
Then we can just create and use a variant of uaccess macros that does
not include the barrier?
The unsafe ones?
Again spec barrier is not needed as such at all. It's defence in depth.
Or, how about moving the barrier into access_ok?
This way repeated accesses with a single access_ok get a bit faster.
CC Dan Williams on this idea.
The problem is, e.g. for vhost control path. During mem table validation, we
don't even want to access them there. So the spec barrier is not needed.
And mem table init is slow path. So we can stick a barrier there and it
won't be a problem for anyone.
I wonder how expensive reading eflags can be?
How about after + smap off?
Let me clarify:
On machine with SMAP (Broadwell):
no smap being before or after?
Before: 5.0Mpps
After: 6.1Mpps
No smap: 7.5Mpps
Thanks
Before (SMAP on): 5.0Mpps
Before (SMAP off): 7.5Mpps
After (SMAP on): 6.1Mpps
Thanks
After (SMAP off): 8.0Mpps
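
For reference, the relative gains implied by the figures quoted above can be worked out with a few lines of Python. This is only arithmetic on the Mpps numbers already given in this thread, nothing measured anew; the variable and function names are made up for illustration.

```python
# Throughput figures (Mpps) as quoted in this thread.
sandy_bridge = {"before": 4.8, "after": 5.2}          # machine without SMAP
broadwell = {
    ("before", "smap_on"): 5.0,
    ("before", "smap_off"): 7.5,
    ("after", "smap_on"): 6.1,
    ("after", "smap_off"): 8.0,
}

def gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0

# Effect of the series itself on each configuration.
series_gain_sandy = gain(sandy_bridge["after"], sandy_bridge["before"])
series_gain_smap_on = gain(broadwell[("after", "smap_on")],
                           broadwell[("before", "smap_on")])
series_gain_smap_off = gain(broadwell[("after", "smap_off")],
                            broadwell[("before", "smap_off")])

# Effect of turning SMAP off, without and with the series applied.
nosmap_gain_before = gain(broadwell[("before", "smap_off")],
                          broadwell[("before", "smap_on")])
nosmap_gain_after = gain(broadwell[("after", "smap_off")],
                         broadwell[("after", "smap_on")])

for name, value in [
    ("series, Sandy Bridge", series_gain_sandy),
    ("series, Broadwell SMAP on", series_gain_smap_on),
    ("series, Broadwell SMAP off", series_gain_smap_off),
    ("SMAP off, before series", nosmap_gain_before),
    ("SMAP off, after series", nosmap_gain_after),
]:
    print(f"{name}: {value:+.1f}%")
```

Read this way, disabling SMAP accounts for a larger share of the difference than the series does on the Broadwell box, which is the point being argued in the surrounding messages.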
And maybe we want a module option just for the vhost thread to keep smap
off generally since almost all it does is copy stuff from userspace into
kernel anyway. Because what the above numbers show is that we really
really want a solution that isn't limited to just meta-data access,
and I really do not see how any such solution can not also be
used to make meta-data access fast.
As we discussed in another thread on the previous version, this requires lots
of changes; the main issue is that SMAP state is not saved/restored on explicit
schedule().
If it's cheap we can just check EFLAGS.AC and rerun stac if needed.
Even if it did, since vhost will call lots of net/block code,
any kind of uaccess in that code needs to understand this special request
from vhost, e.g. you probably need to invent a new kind of iov iterator that
does not touch SMAP at all. And I'm not sure this is the only thing we need
to deal with.
Well we wanted to move packet processing from tun into vhost anyway right?
So I still prefer to:
1) speedup the metadata access through vmap + MMU notifier
2) speedup the datacopy with batched copy (unsafe ones or other new
interfaces)
Thanks
I just guess once you do (2) you will want to rework (1) to use
the new interfaces.
So all the effort you are now investing in (1)
will be wasted. Just my $.02.