Re: [PATCH v2 10/39] x86/mm: Introduce _PAGE_COW
From: Kirill A . Shutemov
Date: Mon Oct 03 2022 - 12:26:31 EST
On Thu, Sep 29, 2022 at 03:29:07PM -0700, Rick Edgecombe wrote:
> +/*
> + * Normally the Dirty bit is used to denote COW memory on x86. But
> + * in the case of X86_FEATURE_SHSTK, the software COW bit is used,
> + * since the Dirty=1,Write=0 will result in the memory being treated
> + * as shadow stack by the HW. So when creating COW memory, a software
> + * bit is used _PAGE_BIT_COW. The following functions pte_mkcow() and
> + * pte_clear_cow() take a PTE marked conventionally COW (Dirty=1) and
> + * transition it to the shadow stack compatible version of COW (Cow=1).
> + */
> +
> +static inline pte_t pte_mkcow(pte_t pte)
> +{
> +	if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
> +		return pte;
> +
> +	pte = pte_clear_flags(pte, _PAGE_DIRTY);
> +	return pte_set_flags(pte, _PAGE_COW);
> +}
> +
> +static inline pte_t pte_clear_cow(pte_t pte)
> +{
> +	/*
> +	 * _PAGE_COW is unnecessary on !X86_FEATURE_SHSTK kernels.
> +	 * See the _PAGE_COW definition for more details.
> +	 */
> +	if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
> +		return pte;
> +
> +	/*
> +	 * PTE is getting copied-on-write, so it will be dirtied
> +	 * if writable, or made shadow stack if shadow stack and
> +	 * being copied on access. Set the dirty bit for both
> +	 * cases.
> +	 */
> +	pte = pte_set_flags(pte, _PAGE_DIRTY);
> +	return pte_clear_flags(pte, _PAGE_COW);
> +}
These X86_FEATURE_SHSTK checks make me uneasy. Maybe use the _PAGE_COW
logic on all machines with 64-bit page table entries? It would get you
much more test coverage and more universal rules.
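
A rough userspace sketch of what unconditional helpers could look like
(not kernel code, untested against the series). The Write and Dirty bit
positions match x86; the position of the Cow bit (58) and all model_*
names are assumptions made up for illustration. It also assumes, as the
pte_mkclean() hunk suggests, that either Dirty or Cow counts as dirty.

```c
#include <stdint.h>

/* Userspace model of a 64-bit PTE; this is not the kernel's pte_t. */
typedef uint64_t model_pte_t;

#define MODEL_PAGE_RW    (1ULL << 1)   /* x86 Write bit */
#define MODEL_PAGE_DIRTY (1ULL << 6)   /* x86 hardware Dirty bit */
#define MODEL_PAGE_COW   (1ULL << 58)  /* assumed software bit position */

/* Move Dirty to the software Cow bit on every machine, no feature check. */
static inline model_pte_t model_pte_mkcow(model_pte_t pte)
{
	pte &= ~MODEL_PAGE_DIRTY;
	return pte | MODEL_PAGE_COW;
}

/* Reverse transition: Cow=1 becomes Dirty=1. */
static inline model_pte_t model_pte_clear_cow(model_pte_t pte)
{
	pte |= MODEL_PAGE_DIRTY;
	return pte & ~MODEL_PAGE_COW;
}

/* Assumption: a PTE is dirty if either the hardware or software bit is set. */
static inline int model_pte_dirty(model_pte_t pte)
{
	return (pte & (MODEL_PAGE_DIRTY | MODEL_PAGE_COW)) != 0;
}
```

With a single rule there is only one encoding of "dirty and
write-protected" to reason about, whether or not the CPU has shadow
stack support.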
> +
> #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
> static inline int pte_uffd_wp(pte_t pte)
> {
> @@ -319,7 +381,7 @@ static inline pte_t pte_clear_uffd_wp(pte_t pte)
>
> static inline pte_t pte_mkclean(pte_t pte)
> {
> -	return pte_clear_flags(pte, _PAGE_DIRTY);
> +	return pte_clear_flags(pte, _PAGE_DIRTY_BITS);
> }
>
> static inline pte_t pte_mkold(pte_t pte)
> @@ -329,7 +391,16 @@ static inline pte_t pte_mkold(pte_t pte)
>
> static inline pte_t pte_wrprotect(pte_t pte)
> {
> -	return pte_clear_flags(pte, _PAGE_RW);
> +	pte = pte_clear_flags(pte, _PAGE_RW);
> +
> +	/*
> +	 * Blindly clearing _PAGE_RW might accidentally create
> +	 * a shadow stack PTE (Write=0,Dirty=1). Move the hardware
> +	 * dirty value to the software bit.
> +	 */
> +	if (pte_dirty(pte))
> +		pte = pte_mkcow(pte);
> +	return pte;
> }
Hm. What about ptep/pmdp_set_wrprotect()? They clear _PAGE_RW blindly.
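
The concern can be reproduced in a small userspace model (not kernel
code). The Write and Dirty bit positions match x86; the Cow bit position
(58) and all model_* names are assumptions for illustration, the feature
check is left out, and the model makes no attempt to reproduce the
atomic update that the real ptep/pmdp helpers perform.

```c
#include <stdint.h>

/* Userspace model of a 64-bit PTE; this is not the kernel's pte_t. */
typedef uint64_t model_pte_t;

#define MODEL_PAGE_RW    (1ULL << 1)   /* x86 Write bit */
#define MODEL_PAGE_DIRTY (1ULL << 6)   /* x86 hardware Dirty bit */
#define MODEL_PAGE_COW   (1ULL << 58)  /* assumed software bit position */

/* Write=0,Dirty=1 is the encoding the hardware treats as shadow stack. */
static inline int model_is_shstk(model_pte_t pte)
{
	return (pte & (MODEL_PAGE_RW | MODEL_PAGE_DIRTY)) == MODEL_PAGE_DIRTY;
}

/* Blind clear of the Write bit, nothing else is touched. */
static inline model_pte_t model_wrprotect_blind(model_pte_t pte)
{
	return pte & ~MODEL_PAGE_RW;
}

/* Same shape as the patched pte_wrprotect(): clear Write, then move Dirty to Cow. */
static inline model_pte_t model_wrprotect_cow(model_pte_t pte)
{
	pte &= ~MODEL_PAGE_RW;
	if (pte & (MODEL_PAGE_DIRTY | MODEL_PAGE_COW))
		pte = (pte & ~MODEL_PAGE_DIRTY) | MODEL_PAGE_COW;
	return pte;
}
```

Starting from a writable, dirty entry, the blind variant ends up with
Write=0,Dirty=1, while the second variant ends up with Write=0,Cow=1.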
--
Kiryl Shutsemau / Kirill A. Shutemov