Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

From: Jason Wang
Date: Fri Mar 08 2019 - 04:16:11 EST



On 2019/3/8 11:45 AM, Jerome Glisse wrote:
On Thu, Mar 07, 2019 at 10:43:12PM -0500, Michael S. Tsirkin wrote:
On Thu, Mar 07, 2019 at 10:40:53PM -0500, Jerome Glisse wrote:
On Thu, Mar 07, 2019 at 10:16:00PM -0500, Michael S. Tsirkin wrote:
On Thu, Mar 07, 2019 at 09:55:39PM -0500, Jerome Glisse wrote:
On Thu, Mar 07, 2019 at 09:21:03PM -0500, Michael S. Tsirkin wrote:
On Thu, Mar 07, 2019 at 02:17:20PM -0500, Jerome Glisse wrote:
It's because of all these issues that I preferred just accessing
userspace memory and handling faults. Unfortunately there does not
appear to exist an API that whitelists a specific driver along the lines
of "I checked this code for speculative info leaks, don't add barriers
on data path please".
Maybe it would be better to explore adding such a helper rather than
remapping pages into the kernel address space?
I explored it a bit (see e.g. thread around: "__get_user slower than
get_user") and I can tell you it's not trivial given the issue is around
security. So in practice it does not seem fair to keep a significant
optimization out of the kernel because *maybe* we can do it differently,
even better :)
Maybe a slightly different approach, between this patchset and the other
copy user APIs, would work here. What you really want is something like
a temporary mlock on a range of memory, so that it is safe for the
kernel to access a range of userspace virtual addresses, i.e. the pages
are present with the proper permissions, hence there can be no page
fault while you are accessing things from kernel context.

So you can have something like a range structure and an mmu notifier.
When you lock the range you block the mmu notifier, which allows your
code to work on the userspace VA safely. Once you are done you unlock
and let the mmu notifier go on. It is pretty much exactly this patchset
except that you remove all the kernel vmap code. A nice thing about that
is that you do not need to worry about calling set page dirty; it
will already be handled by the userspace VA pte. It also uses less
memory than when you have a kernel vmap.
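
For illustration, the lock/unlock scheme described above can be modeled
in plain userspace C. This is only a toy sketch, not kernel code:
struct va_range and the va_range_* helpers are hypothetical names, a
pthread mutex stands in for whatever would block the mmu notifier, and
no real mmu notifier API is used.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a userspace VA range the accessor can "lock". */
struct va_range {
	uintptr_t start;       /* first byte of the range */
	uintptr_t end;         /* one past the last byte */
	pthread_mutex_t lock;  /* held while the accessor works on the range */
	bool valid;            /* cleared once an overlapping invalidation ran */
};

static void va_range_init(struct va_range *r, uintptr_t start, size_t len)
{
	assert(len > 0);
	r->start = start;
	r->end = start + len;
	r->valid = true;
	pthread_mutex_init(&r->lock, NULL);
}

/*
 * Accessor side: take the range lock. Returns true with the lock held if
 * the range is still valid; returns false (lock released) if it has been
 * invalidated and must be set up again before use.
 */
static bool va_range_lock(struct va_range *r)
{
	pthread_mutex_lock(&r->lock);
	if (!r->valid) {
		pthread_mutex_unlock(&r->lock);
		return false;
	}
	return true;
}

static void va_range_unlock(struct va_range *r)
{
	pthread_mutex_unlock(&r->lock);
}

/*
 * Stand-in for the notifier's invalidate callback. If [start, end)
 * overlaps the range, wait until the accessor has unlocked, then mark
 * the range invalid. Returns true if the range was invalidated.
 */
static bool va_range_invalidate(struct va_range *r,
				uintptr_t start, uintptr_t end)
{
	if (end <= r->start || start >= r->end)
		return false; /* no overlap, nothing to do */

	pthread_mutex_lock(&r->lock); /* blocks while the accessor holds it */
	r->valid = false;
	pthread_mutex_unlock(&r->lock);
	return true;
}
```

In this model the accessor brackets each access with va_range_lock()
and va_range_unlock(), and an overlapping invalidation simply waits
for the unlock. Nothing is mapped a second time, which is the point of
dropping the vmap code.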

This idea might be defeated by security features where the kernel is
running in its own address space without the userspace address
space present.
Like smap?
Yes like smap but also other newer changes, with similar effect, since
the spectre drama.

Cheers,
Jérôme
Sorry do you mean meltdown and kpti?
Yes, all that and similar things. I do not have the full list in my head.

Cheers,
Jérôme


Yes, the kernel having its own address space is the main motivation for using vmap here.

Thanks