Re: [PATCH 4/7] mm/migrate: add batch-copy path in migrate_pages_batch

From: Garg, Shivank

Date: Thu May 21 2026 - 09:39:49 EST


On 4/28/2026 9:20 PM, Shivank Garg wrote:

Sashiko review comment:
https://sashiko.dev/#/patchset/20260428155043.39251-2-shivankg%40amd.com

[...]

> + rc = folios_mc_copy(&dst_batch, &unmap_batch, nr_batch);
> + batch_copied = (rc == 0);
> + }
> +
> retry = 1;
> for (pass = 0; pass < nr_pass && retry; pass++) {
> retry = 0;
> thp_retry = 0;
> nr_retry_pages = 0;
>
> - /* Move the unmapped folios */
> - migrate_folios_move(&unmap_folios, &dst_folios,
> - put_new_folio, private, mode, reason,
> - ret_folios, stats, &retry, &thp_retry,
> - &nr_failed, &nr_retry_pages, false);
> + if (!list_empty(&unmap_batch))
> + migrate_folios_move(&unmap_batch, &dst_batch, put_new_folio,
> + private, mode, reason, ret_folios, stats,
> + &retry, &thp_retry, &nr_failed,
> + &nr_retry_pages, batch_copied);
> + if (!list_empty(&unmap_single))
> + migrate_folios_move(&unmap_single, &dst_single, put_new_folio,
> + private, mode, reason, ret_folios, stats,
> + &retry, &thp_retry, &nr_failed,
> + &nr_retry_pages, false);
> }


Sashiko:
Could reusing the batch_copied flag on retry cause data loss?
The early batch copy captures the data before the exclusivity check
(folio_ref_count() == expected_count) in the move phase. If a folio is
pinned by DMA, the early copy might capture stale or partial data while the
move phase correctly detects the elevated refcount and returns -EAGAIN.
During the retry loop, migrate_folios_move() is called again with the
original batch_copied value (true). Once the DMA pin is dropped, the move
phase succeeds but skips the data copy, leaving the destination folio with
the stale data from the first pass.
--

IIUC, This is real race and original code avoids this by placing the copy
after the refcount check, so by the time we copy, no extra ref are held by
writer.

To fix this, I plan to avoid DMA pinned folios from batch path, by adding
refcount check when creating the unmap_batch list.

if (... page_supports_batch_copy(folio) &&
folio_ref_count(folio) == folio_expected_ref_count(folio) + 1) {
/* add to dst_batch/src_batch list */
} else {
/* dst_single/unmap_single list */
}

[...]

> +/**
> + * folios_mc_copy - Copy the contents of list of folios.
> + * @dst_list: destination folio list.
> + * @src_list: source folio list.
> + * @folios_cnt: unused here, present for callback signature compatibility.
> + *
> + * Walks list of src and dst folios in lockstep and copies folio
> + * content via folio_mc_copy(). The caller must ensure both lists have
> + * the same number of entries. This may sleep.
> + *
> + * Return: 0 on success, negative errno on failure.
> + */
> +int folios_mc_copy(struct list_head *dst_list, struct list_head *src_list,
> + unsigned int __always_unused folios_cnt)
> +{
> + struct folio *src, *dst;
> + int ret;
> +
> + dst = list_first_entry(dst_list, struct folio, lru);
> + list_for_each_entry(src, src_list, lru) {
> + ret = folio_mc_copy(dst, src);
> + if (ret)
> + return ret;
> + dst = list_next_entry(dst, lru);
> + }
> +
> + return 0;
> +}

Sashiko:
Should this loop include a cond_resched()?
The original folio migration loop in migrate_folios_move() explicitly calls
cond_resched() on every iteration. While folio_mc_copy() calls
cond_resched() internally, it only does so between subpages of a large
folio:
mm/util.c:folio_mc_copy() {
...
if (++i == nr)
break;
cond_resched();
}
For order-0 folios (nr == 1), the loop breaks before reaching the reschedule
point. Processing a large batch of order-0 folios here will hold the CPU in
an uninterrupted tight loop without yielding.
--

This concern for large batch of order-0 folios looks valid. But I think the
current direction is to remove cond_resched() eventually for lazy preemption.
So should I take Sashiko's suggestion to add cond_resched()?


Thanks,
Shivank