Re: [Xen-devel] Re: [GIT PULL] xen /proc/mtrr implementation

From: Jeremy Fitzhardinge
Date: Wed May 20 2009 - 12:35:37 EST


Ingo Molnar wrote:
* Jan Beulich <JBeulich@xxxxxxxxxx> wrote:

>>> Ingo Molnar <mingo@xxxxxxx> 19.05.09 11:59 >>>
Exactly what is 'bizarre' about using the API defined by the _CPU_ already, without adding any ad-hoc hypercall? Catch the dom0 WRMSRs, filter out the MTRR indices - that's it.
But that is *not* the same as using the hypercalls: The hypercall tells Xen "Change all CPUs' MTRRs with the indicated index to the indicated value", while the MSR write says "Change the MTRR with the given index on the physical CPU the current virtual CPU happens to run on to the given value". [...]

The change of MTRRs on _any_ of the guest CPUs in a dom0 context should immediately be reflected on all CPUs. Asymmetric MTRR settings are madness.

( And the thing is, changing MTRRs is fragile and racy on native Linux no matter what - even without any hypervisors - due to SMM contexts possibly relying on them etc. )

[...] A write-base/write-mask pair may happen to get interrupted (preempted) by the hypervisor, and hence the two writes may happen on different pCPU-s. Teaching the hypervisor to (correctly!) guess what the guest meant in that situation isn't trivial, as then it needs to handle all possible situations (and it can never know whether Dom0 really intended to do something that may look bogus/inconsistent at first glance). [...]

None of this is a problem really if a sane approach is used: a change to the MTRR state on dom0 is applied symmetrically on all CPUs.

Or, alternatively, the hypervisor can expose its own administrative interface to manage MTRRs.

There's no need to fuglify the Linux kernel for that.

I'm not sure what you mean by that, other than as a description of the current case. The Xen MTRR hypercall:

1. treats MTRR ranges as allocatable resources, and keeps track of how
many uses there are of each
2. updates all physical cpus synchronously (i.e., the MTRR is not
presented as a property of dom0's virtual CPU, but as a
system-wide resource)
3. prevents guests from setting inconsistent or conflicting MTRRs
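
To make those three properties concrete, here is a toy model in C. It is only a sketch under my own assumptions, not Xen's implementation: every name (toy_mtrr_add, toy_mtrr_del, the table size) is made up for illustration. A single shared table stands in for "all physical cpus" (point 2), a use count stands in for the allocation tracking (point 1), and an overlap check with differing types stands in for the conflict rule (point 3).

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_RANGES 8   /* hypothetical table size */

/* One system-wide range; in this model it applies to every physical CPU. */
struct toy_range {
    uint64_t base;       /* start address in bytes */
    uint64_t size;       /* length in bytes; 0 marks a free slot */
    unsigned type;       /* memory type, e.g. 0 = UC, 1 = WC, 6 = WB */
    unsigned use_count;  /* number of callers holding this range */
};

static struct toy_range toy_table[TOY_MAX_RANGES];

static int toy_overlaps(const struct toy_range *r, uint64_t base, uint64_t size)
{
    return r->size != 0 && base < r->base + r->size && r->base < base + size;
}

/* Returns the slot index on success, -1 on a conflict or a full table. */
int toy_mtrr_add(uint64_t base, uint64_t size, unsigned type)
{
    int free_slot = -1;

    if (size == 0)
        return -1;

    for (int i = 0; i < TOY_MAX_RANGES; i++) {
        struct toy_range *r = &toy_table[i];

        if (r->size == 0) {
            if (free_slot < 0)
                free_slot = i;
            continue;
        }
        if (r->base == base && r->size == size && r->type == type) {
            r->use_count++;     /* same range again: one more user */
            return i;
        }
        if (toy_overlaps(r, base, size) && r->type != type)
            return -1;          /* overlapping range with a different type */
    }
    if (free_slot < 0)
        return -1;

    toy_table[free_slot] = (struct toy_range){ base, size, type, 1 };
    return free_slot;
}

/* Drops one use and returns the remaining count, or -1 for a bad slot.
 * The slot is freed only when the last user lets go. */
int toy_mtrr_del(int slot)
{
    if (slot < 0 || slot >= TOY_MAX_RANGES || toy_table[slot].size == 0)
        return -1;
    assert(toy_table[slot].use_count > 0);
    if (--toy_table[slot].use_count == 0)
        toy_table[slot].size = 0;
    return (int)toy_table[slot].use_count;
}
```

The point of the sketch is that the caller states an intent ("I want this range with this type") and gets back a handle, which is the same shape as an add/delete style interface rather than a pair of register writes.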

Mapping MSR writes onto this interface is moderately complex, because it means translating a low-semantic-content interface into a high-semantic-content one. It essentially requires parsing the MSR writes to reconstruct the relatively high-level operations of the mtrr_ops interface and then presenting those to Xen.
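
As a rough illustration of what that parsing involves, the sketch below pairs up trapped writes to the variable-range MTRR MSRs and only produces a (base, size, type) operation once a valid mask follows a base. The register layout is the architectural one (type in bits 7:0 of PHYSBASE, valid bit 11 in PHYSMASK, MSRs starting at 0x200), but everything else is my assumption for the example: the function and struct names are hypothetical, the range count and the 36-bit physical address width are hard-coded where real code would query IA32_MTRRCAP and CPUID, and a mask written with the valid bit clear is simply ignored.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_MTRR_PHYSBASE0  0x200u       /* IA32_MTRR_PHYSBASE0 */
#define TOY_NUM_VAR_RANGES  8            /* assumption: really from IA32_MTRRCAP */
#define TOY_PHYS_BITS       36           /* assumption: really from CPUID */
#define TOY_PHYS_FULL       ((1ULL << TOY_PHYS_BITS) - 1)
#define TOY_ADDR_MASK       (TOY_PHYS_FULL & ~0xFFFULL)
#define MTRR_MASK_VALID     (1ULL << 11) /* valid bit in PHYSMASK */

/* The high-level operation reconstructed from a base/mask pair. */
struct toy_mtrr_op {
    uint64_t base;
    uint64_t size;
    unsigned type;
    int reg;
};

static uint64_t toy_base[TOY_NUM_VAR_RANGES];
static int toy_base_seen[TOY_NUM_VAR_RANGES];

/*
 * Feed one trapped WRMSR into the decoder.
 * Returns  1 and fills *op when a complete, valid pair has been seen,
 *          0 when there is nothing to report (not a variable MTRR MSR,
 *            a base write waiting for its mask, or a mask with the
 *            valid bit clear),
 *         -1 when the write cannot be turned into a sane operation.
 */
int toy_decode_wrmsr(uint32_t msr, uint64_t val, struct toy_mtrr_op *op)
{
    if (msr < MSR_MTRR_PHYSBASE0 ||
        msr >= MSR_MTRR_PHYSBASE0 + 2 * TOY_NUM_VAR_RANGES)
        return 0;

    int reg = (int)((msr - MSR_MTRR_PHYSBASE0) / 2);
    int is_mask = (int)((msr - MSR_MTRR_PHYSBASE0) & 1);

    if (!is_mask) {
        toy_base[reg] = val;     /* remember it; the mask decides the rest */
        toy_base_seen[reg] = 1;
        return 0;
    }

    if (!(val & MTRR_MASK_VALID))
        return 0;                /* range being disabled; ignored here */
    if (!toy_base_seen[reg])
        return -1;               /* mask without a base: intent unknown */

    uint64_t mask_bits = val & TOY_ADDR_MASK;
    uint64_t size = (~mask_bits & TOY_PHYS_FULL) + 1;

    if (size & (size - 1))
        return -1;               /* non-contiguous mask */

    op->base = toy_base[reg] & TOY_ADDR_MASK;
    op->size = size;
    op->type = (unsigned)(toy_base[reg] & 0xFF);
    op->reg = reg;
    return 1;
}
```

Even this toy has to keep per-register state between writes and decide what to do with a mask that arrives without its base, which is the kind of guessing the MSR-trapping approach pushes into the hypervisor or the glue layer.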

There are at least a couple of secondary issues which arise from that approach:

* mtrr/generic.c also has to do a number of other things like
disabling caching, tlb flushes, etc. That adds complexity because
Xen guests are never allowed to globally disable caching, so we'd
have to add additional filtering to remove those cr0 writes
* As we've discussed, we'd need to make the mtrr writes implicitly
change all cpus atomically, as the dom0 kernel can't see physical cpus


The net effect would be that we would be making a pile of apparently generic CPU operations (MSR writes, control register writes) actually feed a fairly complex parser, increasing the difference between the Xen and native cases even more.

mtrr/generic.c is about 730 lines of fairly intricate arch-specific code. mtrr/xen.c is 120 lines of straightforward hypercalls. The mtrr_ops interface and the Xen hypercall interface are a close semantic match, so there's very little glue code in there.


But that said, this is a huge distraction, an unbelievable amount of noise for a fairly minor point. We can live without these changes, and they're certainly easy enough to carry out of tree in the meantime. If you can't live with these changes, then drop them and we'll work out something else.

J