Can this in fact work for level != PT_PAGE_TABLE_LEVEL? We might start
at PT_PAGE_DIRECTORY_LEVEL but get 4k pages while iterating.

Ah, I forgot that. We can't assume that the host also supports huge pages for
the next gfn. As Marcelo suggested, we should "only map with level > 1 if
the host page matches the size".
Um, the problem is that getting the host page size requires holding
'mm->mmap_sem'; that can't be taken in atomic context, and it's also a slow
path, while we want the pte prefetch path to be fast.
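For reference, a rough sketch of what such a query involves (modeled on what
kvm_host_page_size()/host_mapping_level() do; the helper name below is made up
for illustration), which is why it needs 'mm->mmap_sem' and cannot run in
atomic context:

	/* Sketch only: the host page size comes from the vma backing the
	 * gfn's hva, and find_vma() must run under mm->mmap_sem, which
	 * sleeps. */
	static unsigned long host_page_size_sketch(struct kvm *kvm, gfn_t gfn)
	{
		struct vm_area_struct *vma;
		unsigned long addr, size = PAGE_SIZE;

		addr = gfn_to_hva(kvm, gfn);
		if (kvm_is_error_hva(addr))
			return PAGE_SIZE;

		down_read(&current->mm->mmap_sem);
		vma = find_vma(current->mm, addr);
		if (vma)
			size = vma_kernel_pagesize(vma);
		up_read(&current->mm->mmap_sem);

		return size;
	}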
How about only allowing prefetch for sp.level == 1 for now? I'll improve it in
the future; I think it needs more time :-)
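A minimal sketch of that restriction, assuming it sits at the top of the
prefetch helper (illustrative only, not the posted patch):

	/* Illustrative guard: only prefetch on last-level shadow pages,
	 * since we can't cheaply verify host huge-page backing here. */
	if (sp->role.level != PT_PAGE_TABLE_LEVEL)
		return;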
Nice. Direct prefetch should usually succeed:
+ pfn = gfn_to_pfn_atomic(vcpu->kvm, gfn);
+ if (is_error_pfn(pfn)) {
+ kvm_release_pfn_clean(pfn);
+ break;
+ }
+ if (pte_prefetch_topup_memory_cache(vcpu))
+ break;
+
+ mmu_set_spte(vcpu, spte, ACC_ALL, ACC_ALL, 0, 0, 1, NULL,
+ sp->role.level, gfn, pfn, true, false);
+ }
+}
Can later augment to call get_user_pages_fast(..., PTE_PREFETCH_NUM,
...) to reduce gup overhead.
But we can't assume the gfn's hva is consecutive; for example, gfn and gfn+1
may be in different slots.
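If batching is still wanted later, one possible shape (a sketch under
assumptions: PTE_PREFETCH_NUM from the patch plus a single-memslot check; not
the posted code) is to batch only when the whole range stays inside one
memslot, so the hvas really are contiguous:

	struct kvm_memory_slot *slot = gfn_to_memslot(vcpu->kvm, gfn);
	struct page *pages[PTE_PREFETCH_NUM];
	int nr = 0;

	/* Batch gup only if gfn .. gfn + PTE_PREFETCH_NUM - 1 lies in a
	 * single slot, so the hvas are contiguous; otherwise fall back to
	 * per-gfn gfn_to_pfn_atomic() as above. */
	if (slot && gfn + PTE_PREFETCH_NUM <= slot->base_gfn + slot->npages)
		nr = get_user_pages_fast(gfn_to_hva(vcpu->kvm, gfn),
					 PTE_PREFETCH_NUM, 1, pages);
	/* pages[0 .. nr-1] could then feed mmu_set_spte() one by one. */

Whether the gup fast path is acceptable from this context (it can fall back to
the slow path) would still need checking, given the atomic-context concern
above.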
Why not kvm_read_guest_atomic()? Can do it outside the loop, instead of:
+ if (!table) {
+ page = gfn_to_page_atomic(vcpu->kvm, sp->gfn);
+ if (is_error_page(page)) {
+ kvm_release_page_clean(page);
+ break;
+ }
+ table = kmap_atomic(page, KM_USER0);
+ table = (pt_element_t *)((char *)table + offset);
+ }
Do you mean reading all the prefetched sptes at one time? If prefetching one
spte fails, the later sptes we read are wasted, so I chose to read the next
spte only when the current spte has been prefetched successfully. But I don't
have a strong opinion on it, since it's fast to read all the sptes at one
time; in the worst case we only need to read 16 * 8 = 128 bytes.
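For what it's worth, the batched read is small; a sketch (the gpte array and
'base_gpa' -- the gpa of the first candidate gpte -- are assumptions for
illustration) using kvm_read_guest_atomic():

	pt_element_t gptes[PTE_PREFETCH_NUM];

	/* One atomic read of at most 16 * 8 = 128 bytes covering all
	 * candidate guest ptes, instead of kmapping the guest page. */
	if (kvm_read_guest_atomic(vcpu->kvm, base_gpa, gptes, sizeof(gptes)))
		return;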
I think a lot of code can be shared with the pte prefetch in invlpg.

Yes, please allow me to clean up that code after my future patchset:
[PATCH v4 9/9] KVM MMU: optimize sync/update unsync-page
It's the last part of the 'allow multiple shadow pages' patchset.