Re: [PATCH 14/15] KVM: MTRR: do not map huge page for non-consistent range

From: Xiao Guangrong
Date: Fri Jun 05 2015 - 02:42:20 EST



[ CCed Zhang Yang ]

On 06/04/2015 04:36 PM, Paolo Bonzini wrote:


On 04/06/2015 10:23, Xiao Guangrong wrote:

So, why do you need to always use IPAT=0? Can patch 15 keep the current
logic for RAM, like this:

	if (is_mmio || kvm_arch_has_noncoherent_dma(vcpu->kvm))
		ret = kvm_mtrr_get_guest_memory_type(vcpu, gfn) <<
		      VMX_EPT_MT_EPTE_SHIFT;
	else
		ret = (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT)
		      | VMX_EPT_IPAT_BIT;

Yeah, that's okay. Actually we considered this approach, however:
- it's light enough; it did not hurt guest performance in our
benchmark.
- the logic has always been used for the noncoherent_dma case, so extending
it to the normal case should be low risk and also helps us check the logic.

But noncoherent_dma is not the common case, so it's not necessarily true
that the risk is low.

I thought noncoherent_dma exists on first-generation IOMMUs, so it should
have been fully tested at that time.


- completely following the MTRR spec would be better than having the host hide it.

We are a virtualization platform, we know well when MTRRs are necessary.

There is a risk in blindly obeying the guest MTRRs: userspace can see stale
data if the guest's accesses bypass the cache. AMD bypasses this by
enabling snooping even in cases that ordinarily wouldn't snoop; for
Intel the solution is that RAM-backed areas should always use IPAT.

Not sure whether combining UC with other cacheable types across guest and host
will cause problems. The SDM mentions that snooping is not required only when
"The UC attribute comes from the MTRRs and the processors are not required
to snoop their caches since the data could never have been cached."
(Vol 3. 11.5.2.2)
VMX does not touch the hardware MTRR MSRs, so I guess snooping works in this case.

I also noticed that if SS (self-snooping) is supported, we need not invalidate
the cache when programming the memory type (Vol 3. 11.11.8), so I guess that
means the CPU works well on a page that has different cache types.

After thinking it over carefully, we (Zhang Yang and I) doubt whether always
setting WB for DMA memory is really a good idea, because we cannot assume WB
DMA works well for all devices. One example is audio DMA (not an MMIO region),
which requires WC to improve its performance.

However, we think the SDM is not clear enough so let's do full vMTRR on MMIO
and noncoherent_dma first. :)
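
For reference, the policy discussed in this thread (guest MTRR type for MMIO
and noncoherent_dma, WB with IPAT otherwise) can be sketched as a standalone
function. This is only an illustration, not the patch itself:
ept_memtype_bits() and its guest_mtrr_type parameter are hypothetical names,
with guest_mtrr_type standing in for the result of
kvm_mtrr_get_guest_memory_type(vcpu, gfn). The constants are assumed to match
the Linux x86 headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the values in the Linux x86 headers. */
#define MTRR_TYPE_UNCACHABLE   0
#define MTRR_TYPE_WRCOMB       1
#define MTRR_TYPE_WRBACK       6
#define VMX_EPT_MT_EPTE_SHIFT  3
#define VMX_EPT_IPAT_BIT       (1ull << 6)

/*
 * Hypothetical helper: returns the memory-type bits of an EPT entry.
 * guest_mtrr_type stands in for what
 * kvm_mtrr_get_guest_memory_type(vcpu, gfn) would return.
 */
static uint64_t ept_memtype_bits(bool is_mmio, bool noncoherent_dma,
				 uint8_t guest_mtrr_type)
{
	/* MMIO or non-coherent DMA: honor the guest MTRR type, IPAT clear. */
	if (is_mmio || noncoherent_dma)
		return (uint64_t)guest_mtrr_type << VMX_EPT_MT_EPTE_SHIFT;

	/* Ordinary RAM: force WB and set IPAT so the guest PAT is ignored. */
	return ((uint64_t)MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) |
	       VMX_EPT_IPAT_BIT;
}
```

With these values, ordinary RAM always yields 0x70 (WB plus IPAT), while a WC
guest MTRR on a noncoherent_dma VM yields 0x8 with IPAT clear.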