Re: [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush
From: Vitaly Kuznetsov
Date: Thu Aug 17 2017 - 03:58:34 EST
Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx> writes:
> On 08/16/2017 12:42 PM, Vitaly Kuznetsov wrote:
>> Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> writes:
>>
>>> In case we decide to go HAVE_RCU_TABLE_FREE for all PARAVIRT-enabled
>>> kernels (as it seems to be the easiest/fastest way to fix Xen PV) - what
>>> do you think about the required testing? Any suggestion for a
>>> specifically crafted micro benchmark in addition to standard
>>> ebizzy/kernbench/...?
>> In the meantime I tested HAVE_RCU_TABLE_FREE with kernbench (enablement
>> patch I used is attached; I know that it breaks other architectures) on
>> bare metal with PARAVIRT enabled in config. The results are:
>>
>>...
>>
>> As you can see, there's no notable difference. I'll think of a
>> microbenchmark though.
>
> FWIW, I was about to send a very similar patch (but with only Xen-PV
> enabling RCU-based free by default) and saw similar results with
> kernbench, both Xen PV and baremetal.
>
Thanks for the confirmation.
I'd go with enabling it for all of PARAVIRT, as we will need it for Hyper-V too.
<snip>
>>
>> #if CONFIG_PGTABLE_LEVELS > 4
>> void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d)
>> {
>> paravirt_release_p4d(__pa(p4d) >> PAGE_SHIFT);
>> +#ifdef CONFIG_HAVE_RCU_TABLE_FREE
>> + tlb_remove_table(tlb, virt_to_page(p4d));
>> +#else
>> tlb_remove_page(tlb, virt_to_page(p4d));
>> +#endif
>
> This can probably be factored out.
>
>> }
>> #endif /* CONFIG_PGTABLE_LEVELS > 4 */
>> #endif /* CONFIG_PGTABLE_LEVELS > 3 */
>> diff --git a/mm/memory.c b/mm/memory.c
>> index e158f7ac6730..18d6671b6ae2 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -329,6 +329,11 @@ bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page, int page_
>> * See the comment near struct mmu_table_batch.
>> */
>>
>> +static void __tlb_remove_table(void *table)
>> +{
>> + free_page_and_swap_cache(table);
>> +}
>> +
>
> This needs to be a per-arch routine (e.g. see arch/arm64/include/asm/tlb.h).
>
Yeah, this was a quick-and-dirty x86-only patch.
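
For illustration, the factoring-out suggested above could look roughly
like the following. This is only a userspace sketch with stubbed types,
not the actual patch: pgtable_free_tlb() and p4d_free_tlb_sketch() are
hypothetical names, the struct layouts are invented stand-ins, and
CONFIG_HAVE_RCU_TABLE_FREE is assumed to be set. The stubs just count
calls so the chosen path is visible.

```c
/* Userspace sketch: stub types stand in for the kernel's real ones. */
#include <stddef.h>

#define CONFIG_HAVE_RCU_TABLE_FREE 1	/* assumption: option is enabled */

struct page { int id; };		/* stand-in, not the kernel struct */

struct mmu_gather {			/* stand-in, not the kernel struct */
	size_t nr_tables;	/* calls that took the RCU table-free path */
	size_t nr_pages;	/* calls that took the plain page-free path */
};

/* Stubs for the two kernel functions named in the quoted diff. */
static inline void tlb_remove_table(struct mmu_gather *tlb, void *table)
{
	(void)table;
	tlb->nr_tables++;
}

static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
{
	(void)page;
	tlb->nr_pages++;
}

/* Hypothetical helper: the #ifdef lives in exactly one place. */
static inline void pgtable_free_tlb(struct mmu_gather *tlb, struct page *page)
{
#ifdef CONFIG_HAVE_RCU_TABLE_FREE
	tlb_remove_table(tlb, page);
#else
	tlb_remove_page(tlb, page);
#endif
}

/*
 * Each per-level free function then reduces to one helper call.
 * The real function would also call paravirt_release_p4d() first.
 */
static inline void p4d_free_tlb_sketch(struct mmu_gather *tlb,
					struct page *p4d_page)
{
	pgtable_free_tlb(tlb, p4d_page);
}
```

With the option defined, a call to p4d_free_tlb_sketch() takes the
tlb_remove_table() path only; dropping the #define switches every
caller to tlb_remove_page() without touching the callers themselves.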
--
Vitaly