Re: [PATCH 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear

From: Thomas Gleixner
Date: Mon Aug 20 2018 - 09:26:13 EST


On Mon, 20 Aug 2018, Juergen Gross wrote:
> In case adding about 6 cycles for native_ptep_get_and_clear() is believed
> to be too bad I can modify the patch to add a paravirt function for that
> purpose in order to add the overhead for Xen guests only (in fact the
> overhead for Xen guests will be less, as only one instruction writing to
> the PTE has to be emulated by the hypervisor).

I doubt that it's worth the trouble of yet another paravirt thingy.

> ---
> arch/x86/include/asm/pgtable-3level.h | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h
> index a564084c6141..7919ae4e27d8 100644
> --- a/arch/x86/include/asm/pgtable-3level.h
> +++ b/arch/x86/include/asm/pgtable-3level.h
> @@ -2,6 +2,8 @@
> #ifndef _ASM_X86_PGTABLE_3LEVEL_H
> #define _ASM_X86_PGTABLE_3LEVEL_H
>
> +#include <asm/atomic64_32.h>
> +
> /*
> * Intel Physical Address Extension (PAE) Mode - three-level page
> * tables on PPro+ CPUs.
> @@ -148,14 +150,14 @@ static inline void pud_clear(pud_t *pudp)
> #ifdef CONFIG_SMP
> static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
> {
> - pte_t res;
> + union {
> + pte_t pte;
> + long long val;
> + } res;
>
> - /* xchg acts as a barrier before the setting of the high bits */
> - res.pte_low = xchg(&ptep->pte_low, 0);
> - res.pte_high = ptep->pte_high;
> - ptep->pte_high = 0;
> + res.val = arch_atomic64_xchg((atomic64_t *)ptep, 0);

Couldn't you just keep

pte_t res;

and do:

res.pte = (pteval_t) arch_atomic64_xchg((atomic64_t *)ptep, 0);

Hmm?
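
To illustrate the shape I mean, here is a rough userspace sketch, not the
kernel code: C11 atomic_exchange() stands in for arch_atomic64_xchg(), and
pteval_t/pte_t are simplified stand-ins for the PAE types (the real pte_t is
a union that also exposes pte_low/pte_high). The name
sketch_ptep_get_and_clear() is made up for this example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumption: simplified stand-ins for the kernel's PAE types. */
typedef uint64_t pteval_t;
typedef struct { pteval_t pte; } pte_t;

/*
 * Hypothetical userspace analogue of native_ptep_get_and_clear():
 * read the old 64-bit entry and clear it in one atomic exchange, so
 * no observer can see a half-cleared entry. No local union is needed,
 * a cast of the returned value is enough.
 */
static inline pte_t sketch_ptep_get_and_clear(_Atomic pteval_t *ptep)
{
	pte_t res;

	res.pte = (pteval_t)atomic_exchange(ptep, 0);
	return res;
}
```

The only point is that the return value of the 64-bit exchange can be
assigned to res.pte directly.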

Thanks,

tglx