Re: [RFC 1/4] arm64/mm: Add SW and HW dirty state helpers
From: Anshuman Khandual
Date: Sun Jul 09 2023 - 22:54:18 EST
On 7/7/23 17:39, David Hildenbrand wrote:
> On 07.07.23 07:33, Anshuman Khandual wrote:
>> This factors out low level SW and HW state changes i.e make and clear into
>> separate helpers making them explicit improving readability. This also adds
>> pte_rdonly() helper as well. No functional change is intended.
>>
>> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
>> Cc: Will Deacon <will@xxxxxxxxxx>
>> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@xxxxxxx>
>> ---
>> arch/arm64/include/asm/pgtable.h | 52 ++++++++++++++++++++++++++------
>> 1 file changed, 42 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index 0bd18de9fd97..fb03be697819 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -103,6 +103,7 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
>> #define pte_young(pte) (!!(pte_val(pte) & PTE_AF))
>> #define pte_special(pte) (!!(pte_val(pte) & PTE_SPECIAL))
>> #define pte_write(pte) (!!(pte_val(pte) & PTE_WRITE))
>> +#define pte_rdonly(pte) (!!(pte_val(pte) & PTE_RDONLY))
>> #define pte_user(pte) (!!(pte_val(pte) & PTE_USER))
>> #define pte_user_exec(pte) (!(pte_val(pte) & PTE_UXN))
>> #define pte_cont(pte) (!!(pte_val(pte) & PTE_CONT))
>> @@ -120,7 +121,7 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
>> (__boundary - 1 < (end) - 1) ? __boundary : (end); \
>> })
>> -#define pte_hw_dirty(pte) (pte_write(pte) && !(pte_val(pte) & PTE_RDONLY))
>> +#define pte_hw_dirty(pte) (pte_write(pte) && !pte_rdonly(pte))
>> #define pte_sw_dirty(pte) (!!(pte_val(pte) & PTE_DIRTY))
>> #define pte_dirty(pte) (pte_sw_dirty(pte) || pte_hw_dirty(pte))
>> @@ -174,6 +175,39 @@ static inline pmd_t clear_pmd_bit(pmd_t pmd, pgprot_t prot)
>> return pmd;
>> }
>> +static inline pte_t pte_hw_mkdirty(pte_t pte)
>
> I'd have called this "pte_mkhw_dirty", similar to "pte_mksoft_dirty".
>
>> +{
>> + if (pte_write(pte))
>> + pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY));
>> +
>> + return pte;
>> +}
>> +
>> +static inline pte_t pte_sw_mkdirty(pte_t pte)
>
> pte_mksw_dirty
Sure, will change them as pte_mkhw_dirty()/pte_mksw_dirty() instead.
>
>> +{
>> + return set_pte_bit(pte, __pgprot(PTE_DIRTY));
>> +}
>> +
>> +static inline __always_unused pte_t pte_hw_clr_dirty(pte_t pte)
>
> pte_clear_hw_dirty (again, similar to pte_clear_soft_dirty )
>
>> +{
>> + return set_pte_bit(pte, __pgprot(PTE_RDONLY));
>> +}
>> +
>> +static inline pte_t pte_sw_clr_dirty(pte_t pte)
>
> pte_clear_sw_dirty
Sure, will change them as pte_clear_hw_dirty()/pte_clear_sw_dirty() instead.
>
>> +{
>> + pte = clear_pte_bit(pte, __pgprot(PTE_DIRTY));
>> +
>> + /*
>> + * Clearing the software dirty state requires clearing
>> + * the PTE_DIRTY bit along with setting the PTE_RDONLY
>> + * ensuring a page fault on subsequent write access.
>> + *
>> + * NOTE: Setting the PTE_RDONLY (as a coincident) also
>> + * implies clearing the HW dirty state.
>> + */
>> + return set_pte_bit(pte, __pgprot(PTE_RDONLY));
>> +}
>> +
>> static inline pmd_t set_pmd_bit(pmd_t pmd, pgprot_t prot)
>> {
>> pmd_val(pmd) |= pgprot_val(prot);
>> @@ -189,19 +223,17 @@ static inline pte_t pte_mkwrite(pte_t pte)
>> static inline pte_t pte_mkclean(pte_t pte)
>> {
>> - pte = clear_pte_bit(pte, __pgprot(PTE_DIRTY));
>> - pte = set_pte_bit(pte, __pgprot(PTE_RDONLY));
>> -
>> - return pte;
>> + /*
>> + * Subsequent call to pte_hw_clr_dirty() is not required
>> + * because pte_sw_clr_dirty() in turn does that as well.
>> + */
>> + return pte_sw_clr_dirty(pte);
>
> Hm, I'm not sure if that simplifies things.
>
> You call pte_sw_clr_dirty() and suddenly your hw dirty bit is clear?
Because clearing HW dirty bit just needs setting PTE_RDONLY bit, which as
a coincidence is also required, after clearing the SW dirty bit to enable
a subsequent write fault. Here pte_sw_clr_dirty() just happen to contain
pte_hw_clr_dirty().
>
> In that case I think the current implementation is clearer: it doesn't provide primitives that don't make any sense.
It actually does a SW dirty bit clearing which also takes care of HW dirty
bit clearing without saying so explicitly. These new helpers demonstrate
bit clearly what is happening.
>
>> }
>> static inline pte_t pte_mkdirty(pte_t pte)
>> {
>> - pte = set_pte_bit(pte, __pgprot(PTE_DIRTY));
>> -
>> - if (pte_write(pte))
>> - pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY));
>> -
>> + pte = pte_sw_mkdirty(pte);
>> + pte = pte_hw_mkdirty(pte);
>
> That looks weird. Especially, pte_hw_mkdirty() only does something if pte_write().
pte_write() check asserts if DBM is implemented and being used before clearing
PTE_RDONLY making it a HW dirty state. If pte_write() is cleared, either DBM
is not implemented or it's a non-writable entry, either way dirty state cannot
be tracked in HW.
>
> Shouldn't pte_hw_mkdirty() bail out if it cannot do anything reasonable (IOW, !writable)?
static inline pte_t pte_hw_mkdirty(pte_t pte)
{
if (pte_write(pte))
pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY));
return pte;
}
If pte_write() is not positive, it's in !writable state on DBM enabled systems.
Otherwise pte_write() state does not matter, as the bit position does not make
sense on non DBM enabled systems.
>
>> return pte;
>> }
>>
>