Re: [PATCH v4 06/10] KVM: MMU: fast path of handling guest page fault

From: Marcelo Tosatti
Date: Wed May 02 2012 - 17:12:18 EST


On Wed, May 02, 2012 at 01:28:39PM +0800, Xiao Guangrong wrote:
> On 05/01/2012 09:34 AM, Marcelo Tosatti wrote:
>
>
> >
> > It is getting better, but it is not there yet: there are still reads of
> > sptep scattered all over (as mentioned before, I think a pattern of read
> > spte once, work on top of that, atomically write and then deal with
> > results _everywhere_ (where mmu lock is held) is more consistent).
> >
>
>
> But we only need to care about the paths that depend on
> is_writable_pte(), no?

Yes.

> So, the callers of is_writable_pte() are spte_has_volatile_bits(),
> spte_write_protect() and set_spte().
>
> I have changed these functions:
> In spte_has_volatile_bits():
> static bool spte_has_volatile_bits(u64 spte)
> {
> + /*
> + * Always atomically update spte if it can be updated
> + * out of mmu-lock.
> + */
> + if (spte_can_lockless_update(spte))
> + return true;
> +
>
> In spte_write_protect():
>
> + spte = mmu_spte_update(sptep, spte);
> +
> + if (is_writable_pte(spte))
> + *flush |= true;
> +
> Here 'spte' is the old value returned by the atomic read-write (xchg).
>
> in set_spte():
> set_pte:
> - mmu_spte_update(sptep, spte);
> + entry = mmu_spte_update(sptep, spte);
> /*
> * If we overwrite a writable spte with a read-only one we
> * should flush remote TLBs. Otherwise rmap_write_protect
> The 'entry' is also the latest value.
>
> > /*
> > * If we overwrite a writable spte with a read-only one we
> > * should flush remote TLBs. Otherwise rmap_write_protect
> > * will find a read-only spte, even though the writable spte
> > * might be cached on a CPU's TLB.
> > */
> > if (is_writable_pte(entry) && !is_writable_pte(*sptep))
> > kvm_flush_remote_tlbs(vcpu->kvm);
> >
> > This is inconsistent with the above obviously.
> >
>
>
> 'entry' is not a problem since it comes from the atomic read-write
> mentioned above. I need to change this code to:
>
> /*
> * Optimization: for pte sync, if spte was writable the hash
> * lookup is unnecessary (and expensive). Write protection
> * is responsibility of mmu_get_page / kvm_sync_page.
> * Same reasoning can be applied to dirty page accounting.
> */
> if (!can_unsync && is_writable_pte(entry)) /* Use 'entry' instead of '*sptep'. */
> goto set_pte;
> ......
>
>
> if (is_writable_pte(entry) && !is_writable_pte(spte)) /* Use 'spte' instead of '*sptep'. */
> kvm_flush_remote_tlbs(vcpu->kvm);

Being able to verify that this or that particular case is ok at the
moment matters less than writing the code in such a way that it's easy
to verify that it is correct.

Thus the suggestion above:

"scattered all over (as mentioned before, I think a pattern of read spte
once, work on top of that, atomically write and then deal with results
_everywhere_ (where mmu lock is held) is more consistent)."
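
To make the suggested pattern concrete, here is a minimal userspace
sketch, not kernel code. It assumes C11 atomics in place of the kernel's
xchg, assumes bit 1 is the writable bit, and uses hypothetical helper
names (spte_xchg, write_protect_once). The point is only the shape:
one read, work on the snapshot, one atomic write, and every later
decision made from the value the atomic write returned, never from a
second read of *sptep.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: bit 1 marks the spte as writable. */
#define PT_WRITABLE_MASK (1ULL << 1)

static bool is_writable_pte(uint64_t pte)
{
	return (pte & PT_WRITABLE_MASK) != 0;
}

/* Hypothetical helper: atomically install new_spte, return the old value. */
static uint64_t spte_xchg(_Atomic uint64_t *sptep, uint64_t new_spte)
{
	return atomic_exchange(sptep, new_spte);
}

/*
 * Hypothetical write-protect helper following the pattern:
 *   1. read the spte exactly once,
 *   2. compute the new value from that snapshot,
 *   3. write it back atomically,
 *   4. decide on the flush from the value the exchange returned.
 * Returns true if the caller needs to flush remote TLBs.
 */
static bool write_protect_once(_Atomic uint64_t *sptep)
{
	uint64_t spte = atomic_load(sptep);                 /* single read */
	uint64_t old = spte_xchg(sptep, spte & ~PT_WRITABLE_MASK);

	/* 'old' is what was really in the slot, even if it changed after the read. */
	return is_writable_pte(old);
}
```

Note that this sketch ignores other bits that may change concurrently
between the read and the exchange; a real implementation would have to
merge those from the returned old value.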

