Re: [PATCH 1/1] mm/madvise: enhance lazyfreeing with mTHP in madvise_free

From: Lance Yang
Date: Mon Feb 26 2024 - 21:16:08 EST


Thanks, Barry!

On Tue, Feb 27, 2024 at 10:12 AM Barry Song <21cnbao@xxxxxxxxx> wrote:
>
> On Tue, Feb 27, 2024 at 2:48 PM Lance Yang <ioworker0@xxxxxxxxx> wrote:
> >
> > On Tue, Feb 27, 2024 at 9:21 AM Barry Song <21cnbao@xxxxxxxxx> wrote:
> > >
> > > > Thanks for your suggestion. I'll use folio_pte_batch() in v2.
> > >
> > > Hi Lance,
> > > Obviously, we both need this. While preparing large folio swap-in
> > > v2, I am exporting folio_pte_batch() as below:
> >
> > Thanks, Barry.
> >
> > Could you separate the export of folio_pte_batch() from the large folio
> > swap-in v2? Prioritizing the push for this specific change would aid in
> > the development of the v2 based on it.
>
> I agree we should get this one pulled in by Andrew early to avoid potential
> dependencies and conflicts between the two series.
>
> >
> > Best,
> > Lance
> >
> > >
> > > From: Barry Song <v-songbaohua@xxxxxxxx>
> > > Date: Tue, 27 Feb 2024 14:05:43 +1300
> > > Subject: [PATCH] mm: export folio_pte_batch as a couple of modules need it
> > >
> > > MADV_FREE, MADV_PAGEOUT and some other modules might need folio_pte_batch()
> > > to check whether a range of PTEs is completely mapped to a large folio with
> > > contiguous physical offsets.
> > >
> > > Cc: Lance Yang <ioworker0@xxxxxxxxx>
> > > Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
> > > Cc: David Hildenbrand <david@xxxxxxxxxx>
> > > Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
> > > ---
> > > mm/internal.h | 13 +++++++++++++
> > > mm/memory.c | 2 +-
> > > 2 files changed, 14 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/internal.h b/mm/internal.h
> > > index 36c11ea41f47..7e11aea3eda9 100644
> > > --- a/mm/internal.h
> > > +++ b/mm/internal.h
> > > @@ -83,6 +83,19 @@ static inline void *folio_raw_mapping(struct folio *folio)
> > > return (void *)(mapping & ~PAGE_MAPPING_FLAGS);
> > > }
> > >
> > > +/* Flags for folio_pte_batch(). */
> > > +typedef int __bitwise fpb_t;
> > > +
> > > +/* Compare PTEs after pte_mkclean(), ignoring the dirty bit. */
> > > +#define FPB_IGNORE_DIRTY ((__force fpb_t)BIT(0))
> > > +
> > > +/* Compare PTEs after pte_clear_soft_dirty(), ignoring the soft-dirty bit. */
> > > +#define FPB_IGNORE_SOFT_DIRTY ((__force fpb_t)BIT(1))
> > > +
> > > +extern int folio_pte_batch(struct folio *folio, unsigned long addr,
> > > + pte_t *start_ptep, pte_t pte, int max_nr, fpb_t flags,
> > > + bool *any_writable);
> > > +
> > > void __acct_reclaim_writeback(pg_data_t *pgdat, struct folio *folio,
> > > int nr_throttled);
> > > static inline void acct_reclaim_writeback(struct folio *folio)
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 6378f6bc22c5..dd9bd67f037a 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -989,7 +989,7 @@ static inline pte_t __pte_batch_clear_ignored(pte_t pte, fpb_t flags)
> > > * If "any_writable" is set, it will indicate if any other PTE besides the
> > > * first (given) PTE is writable.
> > > */
> > > -static inline int folio_pte_batch(struct folio *folio, unsigned long addr,
> > > +int folio_pte_batch(struct folio *folio, unsigned long addr,
> > > pte_t *start_ptep, pte_t pte, int max_nr, fpb_t flags,
> > > bool *any_writable)
> > > {
> > > --
> > > 2.34.1
> > >
> > > > Best,
> > > > Lance
> > >
> Thanks
> Barry
> > >
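
For readers following the thread, here is a rough userspace sketch of the kind
of check the commit message describes: count how many consecutive PTEs map
consecutive pages of one folio. It is not the kernel implementation. The names
toy_pte_t, toy_pte_batch() and TOY_IGNORE_DIRTY are hypothetical, the PTE is
modelled as a plain struct, and only the dirty-bit flag is modelled. As in the
quoted comment, any_writable reports whether any PTE other than the first is
writable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a PTE; this is NOT the kernel's pte_t. */
typedef struct {
	uint64_t pfn;      /* page frame number the entry maps */
	bool present;
	bool writable;
	bool dirty;
} toy_pte_t;

/* Hypothetical flag, loosely modelled on FPB_IGNORE_DIRTY. */
#define TOY_IGNORE_DIRTY (1u << 0)

/*
 * Return how many entries, starting at ptes[0], map consecutive PFNs
 * inside the folio [folio_start_pfn, folio_start_pfn + folio_nr_pages).
 * The writable bit is never compared; it is only reported through
 * any_writable for entries after the first one.
 * Assumption: the caller already knows ptes[0] maps a page of the folio.
 */
static int toy_pte_batch(const toy_pte_t *ptes, int max_nr,
			 uint64_t folio_start_pfn, int folio_nr_pages,
			 unsigned int flags, bool *any_writable)
{
	uint64_t folio_end_pfn = folio_start_pfn + (uint64_t)folio_nr_pages;
	int nr;

	assert(ptes != NULL && max_nr >= 1);

	if (any_writable)
		*any_writable = false;
	if (!ptes[0].present)
		return 0;

	for (nr = 1; nr < max_nr; nr++) {
		const toy_pte_t *cur = &ptes[nr];

		if (!cur->present)
			break;
		/* Physical offsets must be contiguous with the first entry. */
		if (cur->pfn != ptes[0].pfn + (uint64_t)nr)
			break;
		/* Do not run past the end of the folio. */
		if (cur->pfn >= folio_end_pfn)
			break;
		/* Dirty bits must match unless the caller ignores them. */
		if (!(flags & TOY_IGNORE_DIRTY) && cur->dirty != ptes[0].dirty)
			break;
		if (any_writable && cur->writable)
			*any_writable = true;
	}
	return nr;
}
```

In this model, a caller such as a MADV_FREE-style walker would compare the
return value with the number of pages in the folio to decide whether the whole
folio is mapped by the range, and fall back to per-page handling otherwise.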