Re: [RFC PATCH 2/4] mm: add PTE_MARKER_GUARD PTE marker

From: Lorenzo Stoakes
Date: Mon Oct 14 2024 - 06:24:21 EST


On Fri, Oct 11, 2024 at 08:11:32PM +0200, Jann Horn wrote:
> On Fri, Sep 27, 2024 at 2:51 PM Lorenzo Stoakes
> <lorenzo.stoakes@xxxxxxxxxx> wrote:
> > Add a new PTE marker that results in any access causing the accessing
> > process to segfault.
> [...]
> > static inline int is_poisoned_swp_entry(swp_entry_t entry)
> > +{
> > +	/*
> > +	 * We treat guard pages as poisoned too as these have the same semantics
> > +	 * as poisoned ranges, only with different fault handling.
> > +	 */
> > +	return is_pte_marker_entry(entry) &&
> > +	       (pte_marker_get(entry) &
> > +		(PTE_MARKER_POISONED | PTE_MARKER_GUARD));
> > +}
>
> This means MADV_FREE will also clear guard PTEs, right?

Yes, this is expected: MADV_FREE in effect acts like an unmap (only a
delayed one), so we give guard markers the same semantics there. The same
thing happens with hardware poison markers.

You can see in the tests what expectations we have for the various
operations; we assert this specific behaviour there:

/* Lazyfree range. */
ASSERT_EQ(madvise(ptr, 10 * page_size, MADV_FREE), 0);

/* This should simply clear the poison markers. */
for (i = 0; i < 10; i++) {
	ASSERT_TRUE(try_read_write_buf(&ptr[i * page_size]));
}

The tests somewhat self-document expected behaviour.
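To make that concrete outside the selftest harness, here is a minimal
standalone sketch of the same sequence. It assumes the MADV_GUARD_POISON
madvise operation added elsewhere in this series; the hard-coded value is
illustrative only and not in any released uapi headers:

#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_GUARD_POISON
#define MADV_GUARD_POISON 102	/* assumed value, from this series' uapi patch */
#endif

static sigjmp_buf env;

static void segv_handler(int sig)
{
	siglongjmp(env, 1);
}

/* Returns true if the byte at addr can be read without faulting. */
static bool try_read(char *addr)
{
	if (sigsetjmp(env, 1))
		return false;
	(void)*(volatile char *)addr;
	return true;
}

int main(void)
{
	size_t page_size = getpagesize();
	char *ptr = mmap(NULL, 10 * page_size, PROT_READ | PROT_WRITE,
			 MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

	signal(SIGSEGV, segv_handler);

	/* Install guard markers: accesses in the range now fault. */
	madvise(ptr, 10 * page_size, MADV_GUARD_POISON);
	printf("guarded: readable=%d\n", try_read(ptr));		/* 0 */

	/* MADV_FREE zaps the guard markers, just as an unmap would. */
	madvise(ptr, 10 * page_size, MADV_FREE);
	printf("after MADV_FREE: readable=%d\n", try_read(ptr));	/* 1 */

	munmap(ptr, 10 * page_size);
	return 0;
}

The second try_read() succeeds because MADV_FREE clears the guard
markers, exactly as the is_poisoned_swp_entry() change above intends.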

>
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 5c6486e33e63..6c413c3d72fd 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1457,7 +1457,7 @@ static inline bool should_zap_folio(struct zap_details *details,
> > 	return !folio_test_anon(folio);
> > }
> >
> > -static inline bool zap_drop_file_uffd_wp(struct zap_details *details)
> > +static inline bool zap_drop_markers(struct zap_details *details)
> > {
> > 	if (!details)
> > 		return false;
> > @@ -1478,7 +1478,7 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma,
> > 	if (vma_is_anonymous(vma))
> > 		return;
> >
> > -	if (zap_drop_file_uffd_wp(details))
> > +	if (zap_drop_markers(details))
> > 		return;
> >
> > 	for (;;) {
> > @@ -1673,7 +1673,15 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
> > 			 * drop the marker if explicitly requested.
> > 			 */
> > 			if (!vma_is_anonymous(vma) &&
> > -			    !zap_drop_file_uffd_wp(details))
> > +			    !zap_drop_markers(details))
> > +				continue;
> > +		} else if (is_guard_swp_entry(entry)) {
> > +			/*
> > +			 * Ordinary zapping should not remove guard PTE
> > +			 * markers. Only do so if we should remove PTE markers
> > +			 * in general.
> > +			 */
> > +			if (!zap_drop_markers(details))
> > 				continue;
>
> Just a comment: It's nice that the feature is restricted to anonymous
> VMAs, otherwise we'd have to figure out here what to do about
> unmap_mapping_folio() (which sets ZAP_FLAG_DROP_MARKER together with
> details.single_folio)...

Yes, this is not the only issue with file-backed mappings; readahead is
another, and there are plenty more.

We will probably look at how we might support file-backed mappings once
this patch set lands, and tackle all of these fun things then...

>
>
> > 		} else if (is_hwpoison_entry(entry) ||
> > 			   is_poisoned_swp_entry(entry)) {
> > @@ -4005,6 +4013,10 @@ static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
> > 	if (marker & PTE_MARKER_POISONED)
> > 		return VM_FAULT_HWPOISON;
> >
> > +	/* Hitting a guard page is always a fatal condition. */
> > +	if (marker & PTE_MARKER_GUARD)
> > +		return VM_FAULT_SIGSEGV;
> > +
> > 	if (pte_marker_entry_uffd_wp(entry))
> > 		return pte_marker_handle_uffd_wp(vmf);
> >
> > --
> > 2.46.2
> >
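
As an aside on the handle_pte_marker() hunk above: the user-visible
contract is a plain SIGSEGV with si_addr pointing at the guarded address,
as opposed to the SIGBUS raised for hwpoison markers. A minimal sketch of
that, again assuming this series' MADV_GUARD_POISON value:

#define _GNU_SOURCE
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_GUARD_POISON
#define MADV_GUARD_POISON 102	/* assumed value, from this series' uapi patch */
#endif

static char *guard;

static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
	/* VM_FAULT_SIGSEGV surfaces as SIGSEGV, si_addr == fault address. */
	_exit(info->si_addr == guard ? 0 : 1);
}

int main(void)
{
	size_t page_size = getpagesize();
	struct sigaction sa = {
		.sa_sigaction = segv_handler,
		.sa_flags = SA_SIGINFO,
	};

	guard = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
		     MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	sigaction(SIGSEGV, &sa, NULL);
	madvise(guard, page_size, MADV_GUARD_POISON);

	*guard = 'x';	/* hits the guard marker -> fatal SIGSEGV */
	return 1;	/* not reached */
}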