Re: [PATCH 11/11] KVM: MMU: improve write flooding detected

From: Xiao Guangrong
Date: Wed Jul 27 2011 - 06:18:42 EST


On 07/27/2011 05:23 PM, Avi Kivity wrote:
> On 07/26/2011 02:32 PM, Xiao Guangrong wrote:
>> Write-flooding detection does not work well. When we handle a write to a
>> page, if the last speculative spte has not been accessed, we treat the page
>> as write-flooded. However, we can install speculative sptes on many paths,
>> such as pte prefetch and page sync, which means the last speculative spte
>> may not point to the written page, and the written page can be accessed via
>> other sptes. So depending on the Accessed bit of the last speculative spte
>> is not enough.
>>
>> Instead of detecting whether the page is accessed, we can detect whether the
>> spte is accessed. If the spte is not accessed but is written frequently, we
>> treat the page as no longer being a page table, or as not having been used
>> for a long time.
>>
>> static int get_free_pte_list_desc_nr(struct kvm_vcpu *vcpu)
>> {
>> struct kvm_mmu_memory_cache *cache;
>> @@ -3565,22 +3547,14 @@ static u64 mmu_pte_write_fetch_gpte(struct kvm_vcpu *vcpu, gpa_t *gpa,
>> * If we're seeing too many writes to a page, it may no longer be a page table,
>> * or we may be forking, in which case it is better to unmap the page.
>> */
>> -static bool detect_write_flooding(struct kvm_vcpu *vcpu, gfn_t gfn)
>> +static bool detect_write_flooding(struct kvm_mmu_page *sp, u64 *spte)
>> {
>> -	bool flooded = false;
>> -
>> -	if (gfn == vcpu->arch.last_pt_write_gfn
>> -	    && !last_updated_pte_accessed(vcpu)) {
>> -		++vcpu->arch.last_pt_write_count;
>> -		if (vcpu->arch.last_pt_write_count >= 3)
>> -			flooded = true;
>> -	} else {
>> -		vcpu->arch.last_pt_write_gfn = gfn;
>> -		vcpu->arch.last_pt_write_count = 1;
>> -		vcpu->arch.last_pte_updated = NULL;
>> -	}
>> +	if (spte && !(*spte & shadow_accessed_mask))
>> +		sp->write_flooding_count++;
>> +	else
>> +		sp->write_flooding_count = 0;
>>
>> -	return flooded;
>> +	return sp->write_flooding_count >= 3;
>> }
>
> I think this is a little dangerous. A guest kernel may be instantiating multiple gptes on a page fault, but guest userspace hits only one of them (the one which caused the page fault) - I think Windows does this, but I'm not sure.
>

I think this case is not bad: if the guest kernel needs to write multiple gptes
(>= 3), each write will cause a page fault, so we are better off zapping the
shadow page and letting the guest page become writable as soon as possible.
(Also, we have pte prefetch, which can quickly establish the mappings for a new
shadow page.)
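
To make the scenario concrete, here is a small user-space sketch of how the
new counter behaves. It is not kernel code: struct shadow_page,
SHADOW_ACCESSED_MASK and first_flooded_write() are stand-ins invented for
illustration, and only the body of detect_write_flooding() follows the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for the kernel's shadow_accessed_mask (assumption: the real
 * value is set at runtime and depends on the paging mode).
 */
#define SHADOW_ACCESSED_MASK (1ULL << 5)

/* Reduced, hypothetical version of struct kvm_mmu_page. */
struct shadow_page {
	int write_flooding_count;
};

/* Same logic as the patched detect_write_flooding(). */
static bool detect_write_flooding(struct shadow_page *sp, const uint64_t *spte)
{
	if (spte && !(*spte & SHADOW_ACCESSED_MASK))
		sp->write_flooding_count++;
	else
		sp->write_flooding_count = 0;

	return sp->write_flooding_count >= 3;
}

/*
 * Replay n emulated gpte writes against sp, where sptes[i] is the spte
 * that corresponds to the i-th written gpte. Returns the 1-based index
 * of the write that trips the detector, or 0 if none does.
 */
static size_t first_flooded_write(struct shadow_page *sp,
				  const uint64_t *sptes, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (detect_write_flooding(sp, &sptes[i]))
			return i + 1;
	return 0;
}
```

With four writes where only the first spte has the Accessed bit set (the one
that caused the page fault), the count resets on the first write and reaches 3
on the fourth, so the shadow page would be zapped at that point.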
