Re: [RFC PATCH V2 5/5] vhost: access vq metadata through kernel virtual address

From: Michael S. Tsirkin
Date: Mon Mar 11 2019 - 23:51:33 EST


On Tue, Mar 12, 2019 at 10:56:20AM +0800, Jason Wang wrote:
>
> On 2019/3/11 9:43 PM, Andrea Arcangeli wrote:
> > On Mon, Mar 11, 2019 at 08:48:37AM -0400, Michael S. Tsirkin wrote:
> > > Using copyXuser is better I guess.
> > It certainly would be faster there, but I don't think it's needed if
> > that would be the only use case left that justifies supporting two
> > different models. On small 32bit systems with little RAM kmap won't
> > perform measurably different on 32bit or 64bit systems. If the 32bit
> > host has a lot of ram it all gets slow anyway at accessing RAM above
> > the direct mapping, if compared to 64bit host kernels, it's not just
> > an issue for vhost + mmu notifier + kmap and the best way to optimize
> > things is to run 64bit host kernels.
> >
> > Like Christoph pointed out, the main use case for retaining the
> > copy-user model would be CPUs with virtually indexed, not physically
> > tagged, data caches (they'll still suffer from the spectre-v1 fix,
> > although I'd rule out that they have to suffer the SMAP
> > slowdown/feature). Those may require more flushing than the
> > current copy-user model requires.
> >
> > As a rule of thumb, any arch where copy_user_page isn't defined as
> > copy_page will require some additional cache flushing after the
> > kmap. Supposedly with vmap, the vmap layer should have taken care of
> > that (I didn't verify that yet).
>
>
> vmap_page_range()/free_unmap_vmap_area() will call
> flush_cache_vmap()/flush_cache_vunmap(). So the vmap layer should be ok.
>
> Thanks

You only unmap from mmu notifier though.
You don't do it after any access.

>
> >
> > There are some accessors like copy_to_user_page() and
> > copy_from_user_page() that could work and are obviously defined as a
> > raw memcpy on x86 (the main con is that they don't provide word-granular
> > access), and at least on sparc they're tailored to ptrace assumptions,
> > so we'd need to evaluate what happens if this is used outside of
> > ptrace context. kmap has generally been used to access whole
> > pages (i.e. copy_user_page), so ptrace may actually be the only use
> > case with subpage-granularity access.
> >
> > #define copy_to_user_page(vma, page, vaddr, dst, src, len)		\
> > 	do {								\
> > 		flush_cache_page(vma, vaddr, page_to_pfn(page));	\
> > 		memcpy(dst, src, len);					\
> > 		flush_ptrace_access(vma, page, vaddr, src, len, 0);	\
> > 	} while (0)
> >
> > So I wouldn't rule out the need for a dual model, until we solve how
> > to run this stable on non-x86 arches with not physically tagged
> > caches.
> >
> > Thanks,
> > Andrea