Re: [PATCH 27/31] mm/khugepaged: allow pte_offset_map[_lock]() to fail
From: Hugh Dickins
Date: Wed May 24 2023 - 00:45:02 EST
On Mon, 22 May 2023, Yang Shi wrote:
> On Sun, May 21, 2023 at 10:24 PM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> >
> > __collapse_huge_page_swapin(): don't drop the map after every pte, it
> > only has to be dropped by do_swap_page(); give up if pte_offset_map()
> > fails; trace_mm_collapse_huge_page_swapin() at the end, with result;
> > fix comment on returned result; fix vmf.pgoff, though it's not used.
> >
> > collapse_huge_page(): use pte_offset_map_lock() on the _pmd returned
> > from clearing; allow failure, but it should be impossible there.
> > hpage_collapse_scan_pmd() and collapse_pte_mapped_thp() allow for
> > pte_offset_map_lock() failure.
> >
> > Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
>
> Reviewed-by: Yang Shi <shy828301@xxxxxxxxx>

Thanks.

>
> A nit below:
>
> > ---
> > mm/khugepaged.c | 72 +++++++++++++++++++++++++++++++++----------------
> > 1 file changed, 49 insertions(+), 23 deletions(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 732f9ac393fc..49cfa7cdfe93 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
...
> > @@ -1029,24 +1040,29 @@ static int __collapse_huge_page_swapin(struct mm_struct *mm,
> > * resulting in later failure.
> > */
> > if (ret & VM_FAULT_RETRY) {
> > - trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
> > /* Likely, but not guaranteed, that page lock failed */
> > - return SCAN_PAGE_LOCK;
> > + result = SCAN_PAGE_LOCK;
>
> With per-VMA lock, this may not be true anymore, at least not true
> until per-VMA lock supports swap fault. It may be better to have a
> more general failure code, for example, SCAN_FAIL. But anyway you
> don't have to change it in your patch, I can send a follow-up patch
> once this series is landed on mm-unstable.

Interesting point (I've not tried to wrap my head around what differences
per-VMA locking would make to old likelihoods here), and thank you for
deferring a change on it - appreciated.

Something to beware of, if you do choose to change it: mostly those
SCAN codes (I'm not a fan of them!) are only for a tracepoint somewhere,
but madvise_collapse() and madvise_collapse_errno() take some of them
more seriously than others - I think SCAN_PAGE_LOCK ends up as an
EAGAIN (rightly), but SCAN_FAIL as an EINVAL (depends).

But maybe there are layers in between which do not propagate the result
code, I didn't check. All in all, not something I'd spend time on myself.
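
For illustration, the errno concern above can be sketched in plain
userspace C. This is a simplified sketch, not the kernel's code: the
enum and function names (sketch_scan_result, sketch_collapse_errno) are
hypothetical stand-ins, and the mapping covers only the two cases named
above, which the message itself states tentatively ("I think").

```c
#include <errno.h>

/*
 * Hypothetical stand-ins for a few khugepaged SCAN result codes.
 * These are NOT the kernel's definitions; they exist only for this sketch.
 */
enum sketch_scan_result {
	SKETCH_SCAN_SUCCEED,
	SKETCH_SCAN_PAGE_LOCK,
	SKETCH_SCAN_FAIL,
};

/*
 * Assumed mapping, following the description in the message above:
 * a page-lock failure is treated as transient (retry may succeed),
 * while a generic failure falls through to "invalid".
 */
static int sketch_collapse_errno(enum sketch_scan_result r)
{
	switch (r) {
	case SKETCH_SCAN_SUCCEED:
		return 0;
	case SKETCH_SCAN_PAGE_LOCK:
		return -EAGAIN;	/* transient: caller may retry */
	case SKETCH_SCAN_FAIL:
	default:
		return -EINVAL;	/* generic failure: not presented as retryable */
	}
}
```

Under this sketch, switching the swap-in retry path from SCAN_PAGE_LOCK
to SCAN_FAIL would turn a retryable -EAGAIN into -EINVAL for a caller,
provided the result code is propagated that far, which the message above
notes was not checked.
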
Hugh