Re: [PATCH RFC] vm_unmap_aliases: allow callers to inhibit TLB flush

From: Jeremy Fitzhardinge
Date: Thu Feb 19 2009 - 12:02:53 EST


Nick Piggin wrote:
On Wednesday 18 February 2009 08:57:56 Jeremy Fitzhardinge wrote:
Nick Piggin wrote:
I have patches to move the tlb flushing to an asynchronous process
context... but all tweaks to that (including flushing at vmap) are just
variations on the existing flushing scheme and don't solve your problem,
so I don't think we really need to change that for the moment (my patches
are mainly for latency improvement and to allow vunmap to be usable from
interrupt context).
Hi Nick,

I'm very interested in being able to call vm_unmap_aliases() from
interrupt context. Does the work you mention here encompass that?

No, and it can't because we can't do the global kernel tlb flush
from interrupt context.

There is basically no point in doing the vm_unmap_aliases from
interrupt context without doing the global TLB flush as well,
because you still cannot reuse the virtual memory, you still have
possible aliases to it, and you still need to schedule a TLB flush
at some point anyway.

But that's only an issue when you actually do want to reuse the virtual address space. Couldn't you set a flag saying "tlb flush needed", so that when cpu X is about to use some of that address space, it flushes first? That avoids the need for synchronous cross-cpu tlb flushes. It assumes no cpu is currently using that address space, but I think any such use would indicate a bug anyway.

(Xen does something like this internally to either defer or avoid many expensive tlb operations.)
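
To make the flag idea concrete, here is a rough user-space sketch in C. It is only an illustration under assumptions of mine: it uses a generation counter rather than a single flag so that each cpu can tell whether it has flushed since the last unmap, and every name in it (unmap_gen, lazy_unmap_note, lazy_flush_before_reuse, and so on) is hypothetical and not taken from the vmalloc code or from Xen. A real version would need per-cpu data, memory barriers, and an actual local TLB flush where the counter is bumped.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* assumed fixed cpu count, for illustration only */

/* Bumped whenever some kernel virtual address space is unmapped. */
static unsigned long unmap_gen;

/* Generation each cpu had observed at its last local flush. */
static unsigned long cpu_flushed_gen[NR_CPUS];

/* Counts simulated local flushes per cpu. */
static unsigned int local_flushes[NR_CPUS];

/* Unmap path: no cross-cpu IPI, just record that stale entries may exist. */
static void lazy_unmap_note(void)
{
	unmap_gen++;
}

/*
 * Called by a cpu just before it reuses lazily-unmapped address space.
 * Flushes locally only if an unmap happened since this cpu's last flush.
 * Returns true if a (simulated) flush was performed.
 */
static bool lazy_flush_before_reuse(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if (cpu_flushed_gen[cpu] == unmap_gen)
		return false;		/* nothing stale for this cpu */

	local_flushes[cpu]++;		/* stand-in for a local TLB flush */
	cpu_flushed_gen[cpu] = unmap_gen;
	return true;
}
```

The point of the sketch is that the unmap side does no cross-cpu work at all; each cpu pays for at most one local flush, and only when it is about to touch the address space again.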

For Xen dom0, when someone does something like dma_alloc_coherent, we
allocate the memory as normal, and then swizzle the underlying physical
pages to be machine physically contiguous (vs contiguous pseudo-physical
guest memory), and within the addressable range for the device. In
order to do that, we need to make sure the pages are only mapped by the
linear mapping, and there are no other aliases.

These are just stale aliases that will no longer be operated on
unless there is a kernel bug -- so can you just live with them,
or is it a security issue of memory access escaping its domain?

The underlying physical page is being exchanged, so the old page is being returned to Xen's free page pool. It will refuse to do the exchange if the guest still has pagetable references to the page.


And since drivers are free to allocate dma memory at interrupt time,
this needs to happen at interrupt time too.

(The tlb flush issue that started this thread should be a non-issue for
Xen, at least, because all cross-cpu tlb flushes should happen via a
hypercall rather than kernel-initiated IPIs, so there's no possibility
of deadlock. Though I'll happily admit that relying on the properties
of a particular implementation is not very pretty...)

If there is really no other way around it, it would be possible to
allow arch code to take advantage of this if it knows its TLB
flush is interrupt safe.

It's almost safe. I've got this patch in my tree to tie up the flush_tlb_all loose end, though I won't claim it's pretty.