Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation
From: Jeremy Fitzhardinge
Date: Wed May 20 2009 - 12:35:37 EST
Ingo Molnar wrote:
* Jan Beulich <JBeulich@xxxxxxxxxx> wrote:
Ingo Molnar <mingo@xxxxxxx> 19.05.09 11:59 >>>
Exactly what is 'bizarre' about using the API defined by the
_CPU_ already, without adding any ad-hoc hypercall? Catch the
dom0 WRMSRs, filter out the MTRR indices - that's it.
But that is *not* the same as using the hypercalls: The hypercall
tells Xen "Change all CPUs' MTRRs with the indicated index to the
indicated value", while the MSR write says "Change the MTRR with
the given index on the physical CPU the current virtual CPU
happens to run on to the given value". [...]
The change of MTRR's on _any_ of the guest CPUs in a dom0 context
should immediately be reflected on all CPUs. Asymmetric MTRR settings
are madness.
( And the thing is, changing MTRRs is fragile and racy on native
Linux no matter what - even without any hypervisors - due to SMM
contexts possibly relying on them etc. )
[...] A write-base/write-mask pair may happen to get interrupted
(preempted) by the hypervisor, and hence the two writes may happen
on different pCPU-s. Teaching the hypervisor to (correctly!) guess
what the guest meant in that situation isn't trivial, as then it
needs to handle all possible situations (and it can never know
whether Dom0 really intended to do something that may look
bogus/inconsistent at the first glance). [...]
None of this is a problem really if a sane approach is used: a
change to the MTRR state on dom0 is applied symmetrically on all
CPUs.
Or, alternatively, the hypervisor can expose its own administrative
interface to manage MTRRs.
There's no need to fuglify the Linux kernel for that.
I'm not sure what you mean by that, other than as a description of the
current case. The Xen MTRR hypercall:
1. treats MTRR ranges as allocatable resources, and keeps track of how
many uses there are of each
2. updates all physical cpus synchronously (ie, the MTRR is not
presented as a property of dom0's virtual CPU, but as a
system-wide resource)
3. prevents guests from setting inconsistent or conflicting MTRRs
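To make those three properties concrete, here is a minimal user-space sketch of that kind of interface. It is not Xen's code: the names (memtype_add, memtype_del, mtrr_table, cpu_view) and the sizes NR_CPUS and NR_RANGES are made up for illustration, and the "physical CPUs" are just an array that is updated in one pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS   4   /* hypothetical number of physical CPUs */
#define NR_RANGES 8   /* hypothetical number of variable-range MTRRs */

struct mtrr_range {
    uint64_t base;      /* start address in bytes */
    uint64_t size;      /* length in bytes; 0 means the slot is free */
    uint8_t  type;      /* memory type */
    unsigned refcount;  /* number of users of this range */
};

/* System-wide table: the single source of truth (property 2). */
struct mtrr_range mtrr_table[NR_RANGES];
/* What each physical CPU has programmed; always a copy of mtrr_table. */
struct mtrr_range cpu_view[NR_CPUS][NR_RANGES];

static bool overlaps(const struct mtrr_range *r, uint64_t base, uint64_t size)
{
    return r->size != 0 && base < r->base + r->size && r->base < base + size;
}

/* Push one register to every CPU; callers never address a single CPU. */
static void sync_all_cpus(int reg)
{
    assert(reg >= 0 && reg < NR_RANGES);
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        cpu_view[cpu][reg] = mtrr_table[reg];
}

/* Returns the register index used, or -1 if the request is rejected. */
int memtype_add(uint64_t base, uint64_t size, uint8_t type)
{
    int free_slot = -1;

    if (size == 0)
        return -1;

    for (int i = 0; i < NR_RANGES; i++) {
        struct mtrr_range *r = &mtrr_table[i];

        if (r->size == 0) {
            if (free_slot < 0)
                free_slot = i;
            continue;
        }
        /* Same range requested again: just count the new user (property 1). */
        if (r->base == base && r->size == size && r->type == type) {
            r->refcount++;
            return i;
        }
        /* Overlap with a different type would be inconsistent (property 3). */
        if (overlaps(r, base, size) && r->type != type)
            return -1;
    }

    if (free_slot < 0)
        return -1;

    mtrr_table[free_slot] = (struct mtrr_range){ base, size, type, 1 };
    sync_all_cpus(free_slot);
    return free_slot;
}

/* Drops one user; the range is only torn down when the last user goes. */
int memtype_del(int reg)
{
    if (reg < 0 || reg >= NR_RANGES || mtrr_table[reg].size == 0)
        return -1;

    if (--mtrr_table[reg].refcount == 0) {
        mtrr_table[reg] = (struct mtrr_range){ 0 };
        sync_all_cpus(reg);
    }
    return 0;
}
```

The point of the sketch is that the caller only ever says "add this range" or "drop this range"; which CPU it happens to be running on never enters into it.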
Mapping from MSR writes to this interface is moderately complex, because
it requires a mapping from a low-semantic-content interface to a
high-semantic-content interface. It essentially requires parsing the
MSR writes to map them back to the relatively high-level operations at
the mtrr_ops interface and then present that to Xen.
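As a rough illustration of the parsing step, the sketch below turns one trapped PHYSBASE/PHYSMASK pair back into a (register, base, size, type) tuple. The MSR numbering (0x200 + 2n for the base, 0x201 + 2n for the mask), the type field in bits 7:0 and the valid bit 11 of the mask follow the architectural MTRR layout; everything else (mtrr_decode_pair, struct mtrr_region, the fixed NR_VAR_RANGES of 8, passing the physical address width as a parameter) is an assumption made for the example. It also assumes both halves of the pair have already been collected, which is exactly the part that gets hard if the two writes land on different physical CPUs.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_MTRR_PHYSBASE0 0x200u
#define NR_VAR_RANGES      8u            /* assumed; real count comes from MTRRcap */
#define MTRR_MASK_VALID    (1ULL << 11)  /* valid bit in PHYSMASKn */
#define MTRR_TYPE_BITS     0xffULL       /* type field in PHYSBASEn */

struct mtrr_region {
    unsigned reg;    /* variable-range register index */
    uint64_t base;   /* bytes */
    uint64_t size;   /* bytes; 0 when the register is disabled */
    uint8_t  type;
    bool     valid;
};

/*
 * Decode a base/mask pair written to register base_msr and base_msr + 1.
 * Returns false if the pair cannot be expressed as one aligned,
 * power-of-two sized region.
 */
bool mtrr_decode_pair(uint32_t base_msr, uint64_t base_val, uint64_t mask_val,
                      unsigned phys_bits, struct mtrr_region *out)
{
    if (base_msr < MSR_MTRR_PHYSBASE0 || ((base_msr - MSR_MTRR_PHYSBASE0) & 1))
        return false;                    /* not a PHYSBASEn register */
    if ((base_msr - MSR_MTRR_PHYSBASE0) / 2 >= NR_VAR_RANGES)
        return false;
    if (phys_bits <= 12 || phys_bits > 52)
        return false;

    /* Address bits that are meaningful in both registers: [phys_bits-1:12]. */
    uint64_t addr_bits = ((1ULL << phys_bits) - 1) & ~0xfffULL;
    uint64_t mask = mask_val & addr_bits;

    out->reg   = (base_msr - MSR_MTRR_PHYSBASE0) / 2;
    out->base  = base_val & addr_bits;
    out->type  = (uint8_t)(base_val & MTRR_TYPE_BITS);
    out->valid = (mask_val & MTRR_MASK_VALID) != 0;
    out->size  = 0;

    if (!out->valid)
        return true;                     /* a disable: nothing more to decode */

    if (mask == 0)
        return false;

    /* For a contiguous mask, the lowest set bit is the region size. */
    uint64_t size = mask & (~mask + 1);
    if (mask + size != (1ULL << phys_bits))
        return false;                    /* mask has holes */
    if (out->base & (size - 1))
        return false;                    /* base not aligned to size */

    out->size = size;
    return true;
}
```

Even this small piece has to reject masks with holes and misaligned bases, which the CPU itself would accept without complaint.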
There are at least a couple of secondary issues which arise from that
approach:
* mtrr/generic.c also has to do a number of other things like
disabling caching, tlb flushes, etc. That adds complexity because
Xen guests are never allowed to globally disable caching, so we'd
have to add additional filtering to remove those cr0 writes
* As we've discussed, we'd need to make the mtrr writes implicitly
change all cpus atomically, as the dom0 kernel can't see physical cpus
The net effect would be that we would be making a pile of apparently
generic CPU operations (MSR writes, control register writes) actually
feed a fairly complex parser, increasing the difference between the Xen
and native cases even more.
mtrr/generic.c is about 730 lines of fairly intricate arch-specific code.
mtrr/xen.c is 120 lines of straightforward hypercalls. The mtrr_ops
interface and the Xen hypercall interface are a close semantic match, so
there's very little glue code in there.
But that said, this is a huge distraction, an unbelievable amount of noise
for a fairly minor point. We can live without these changes, and
they're certainly easy enough to carry out of tree in the meantime. If
you can't live with these changes, then drop them and we'll work out
something else.
J