Re: [PATCH 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
From: Juergen Gross
Date: Mon Aug 20 2018 - 10:56:03 EST
On 20/08/18 15:26, Thomas Gleixner wrote:
> On Mon, 20 Aug 2018, Juergen Gross wrote:
>> In case adding about 6 cycles for native_ptep_get_and_clear() is believed
>> to be too bad I can modify the patch to add a paravirt function for that
>> purpose in order to add the overhead for Xen guests only (in fact the
>> overhead for Xen guests will be less, as only one instruction writing to
>> the PTE has to be emulated by the hypervisor).
>
> I doubt that it's worth the trouble of yet another paravirt thingy.
>
>> ---
>> arch/x86/include/asm/pgtable-3level.h | 14 ++++++++------
>> 1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h
>> index a564084c6141..7919ae4e27d8 100644
>> --- a/arch/x86/include/asm/pgtable-3level.h
>> +++ b/arch/x86/include/asm/pgtable-3level.h
>> @@ -2,6 +2,8 @@
>> #ifndef _ASM_X86_PGTABLE_3LEVEL_H
>> #define _ASM_X86_PGTABLE_3LEVEL_H
>>
>> +#include <asm/atomic64_32.h>
>> +
>> /*
>> * Intel Physical Address Extension (PAE) Mode - three-level page
>> * tables on PPro+ CPUs.
>> @@ -148,14 +150,14 @@ static inline void pud_clear(pud_t *pudp)
>> #ifdef CONFIG_SMP
>> static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
>> {
>> -	pte_t res;
>> +	union {
>> +		pte_t pte;
>> +		long long val;
>> +	} res;
>>
>> -	/* xchg acts as a barrier before the setting of the high bits */
>> -	res.pte_low = xchg(&ptep->pte_low, 0);
>> -	res.pte_high = ptep->pte_high;
>> -	ptep->pte_high = 0;
>> +	res.val = arch_atomic64_xchg((atomic64_t *)ptep, 0);
>
> Couldn't you just keep
>
> pte_t res;
>
> and do:
>
> res.pte = (pteval_t) arch_atomic64_xchg((atomic64_t *)ptep, 0);
>
> Hmm?
Yes, Jan already suggested this. I'll hold V2 until tomorrow to see
whether anyone has further complaints.
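
For illustration only, here is a rough user-space sketch of what the
simplified variant boils down to: the old entry is read and cleared by a
single 64-bit exchange, so no union is needed just to hold the result.
This is not the V2 patch. C11 atomic_exchange() stands in for
arch_atomic64_xchg(), and demo_pte_t and demo_ptep_get_and_clear() are
made-up names that only mimic the little-endian PAE layout.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the PAE pte_t: two 32-bit halves overlaid
 * on one 64-bit value (little-endian layout assumed, as on x86).
 */
typedef union {
	struct {
		uint32_t pte_low;
		uint32_t pte_high;
	};
	uint64_t pte;
} demo_pte_t;

/*
 * Fetch the old entry and store 0 in one atomic 64-bit exchange, so no
 * observer can see a half-cleared entry. atomic_exchange() is only a
 * user-space substitute for the kernel's arch_atomic64_xchg().
 */
static inline demo_pte_t demo_ptep_get_and_clear(_Atomic uint64_t *ptep)
{
	demo_pte_t res;

	res.pte = atomic_exchange(ptep, 0);
	return res;
}
```

The point of the sketch is just that the 64-bit return value can be
assigned to the .pte member directly, as in the suggestion quoted above.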
Juergen