Re: [RFC PATCH v6 2/5] mm: migrate: Add migrate_misplaced_folios_batch()

From: Gregory Price

Date: Tue Apr 21 2026 - 12:05:18 EST


On Tue, Apr 21, 2026 at 08:55:02PM +0530, Donet Tom wrote:
>
> Hi Bharata
>
> On 3/23/26 3:21 PM, Bharata B Rao wrote:
> > From: Gregory Price <gourry@xxxxxxxxxx>
> >
> > Tiered memory systems often require migrating multiple folios at once.
> > Currently, migrate_misplaced_folio() handles only one folio per call,
> > which is inefficient for batch operations. This patch introduces
> > migrate_misplaced_folios_batch(), a batch variant that leverages
> > migrate_pages() internally for improved performance.
> >
> > The caller must isolate folios beforehand using
> > migrate_misplaced_folio_prepare(). On return, the folio list will be
> > empty regardless of success or failure.
> >
> > This function will be used by pghot kmigrated thread.
> >
> > Signed-off-by: Gregory Price <gourry@xxxxxxxxxx>
> > [Rewrote commit description]
> > Signed-off-by: Bharata B Rao <bharata@xxxxxxx>
> > ---
> > include/linux/migrate.h | 6 ++++++
> > mm/migrate.c | 48 +++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 54 insertions(+)
> >
> > diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> > index d5af2b7f577b..5c1e2691cec2 100644
> > --- a/include/linux/migrate.h
> > +++ b/include/linux/migrate.h
> > @@ -111,6 +111,7 @@ static inline void softleaf_entry_wait_on_locked(softleaf_t entry, spinlock_t *p
> > int migrate_misplaced_folio_prepare(struct folio *folio,
> > struct vm_area_struct *vma, int node);
> > int migrate_misplaced_folio(struct folio *folio, int node);
> > +int migrate_misplaced_folios_batch(struct list_head *folio_list, int node);
> > #else
> > static inline int migrate_misplaced_folio_prepare(struct folio *folio,
> > struct vm_area_struct *vma, int node)
> > @@ -121,6 +122,11 @@ static inline int migrate_misplaced_folio(struct folio *folio, int node)
> > {
> > return -EAGAIN; /* can't migrate now */
> > }
> > +static inline int migrate_misplaced_folios_batch(struct list_head *folio_list,
> > + int node)
> > +{
> > + return -EAGAIN; /* can't migrate now */
> > +}
> > #endif /* CONFIG_NUMA_BALANCING */
> > #ifdef CONFIG_MIGRATION
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index a15184950e65..94daec0f49ef 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -2751,5 +2751,53 @@ int migrate_misplaced_folio(struct folio *folio, int node)
> > BUG_ON(!list_empty(&migratepages));
> > return nr_remaining ? -EAGAIN : 0;
> > }
> > +
> > +/**
> > + * migrate_misplaced_folios_batch() - Batch variant of migrate_misplaced_folio
> > + * Attempts to migrate a folio list to the specified destination.
> > + * @folio_list: Isolated list of folios to be batch-migrated.
> > + * @node: The NUMA node ID to where the folios should be migrated.
> > + *
> > + * Caller is expected to have isolated the folios by calling
> > + * migrate_misplaced_folio_prepare(), which will result in an
> > + * elevated reference count on the folio. All the isolated folios
> > + * in the list must belong to the same memcg so that NUMA_PAGE_MIGRATE
> > + * stat can be attributed correctly to the memcg.
> > + *
> > + * This function will un-isolate the folios, drop the elevated reference
> > + * and remove them from the list before returning. This is called
> > + * only for batched promotion of hot pages from lower tier nodes.
> > + *
> > + * Return: 0 on success and -EAGAIN on failure or partial migration.
> > + * On return, @folio_list will be empty regardless of success/failure.
> > + */
> > +int migrate_misplaced_folios_batch(struct list_head *folio_list, int node)
> > +{
> > + pg_data_t *pgdat = NODE_DATA(node);
> > + struct mem_cgroup *memcg = NULL;
> > + unsigned int nr_succeeded = 0;
> > + int nr_remaining;
> > +
> > + if (!list_empty(folio_list)) {
> >
> We seem to proceed even when the list is empty. Should we instead return
> early in that case?
>

Well, that seems utterly reasonable; yes, you are right.

> > + struct folio *first = list_first_entry(folio_list, struct folio, lru);
> > + memcg = get_mem_cgroup_from_folio(first);
>
>
> I had a small question—are we ensuring that a single list contains folios
> from the same memcg?
>

It has been a long while since I originally wrote this commit.

I believe that when I originally wrote this, I used it in the context of
folio_mark_accessed()-driven promotions - trying to get some semblance
of NUMA balancing for unmapped page cache pages.

These folios got put into a task workqueue that then got processed on
the way out of the kernel.

I think I made the assumption at the time that the folios would all
belong to the same memcg - I have since learned that this almost
certainly is not the case.

That means a bulk migration may have to first process the folios into
lists by memcg before migrating them.

So this commit likely needs to be redone.

~Gregory