Re: [PATCH v2] mm/page_io: fix PSWPIN undercount for large folios in sio_read_complete()
From: David CARLIER
Date: Wed Apr 01 2026 - 03:18:46 EST
On Tue, 31 Mar 2026 at 23:33, Barry Song <21cnbao@xxxxxxxxx> wrote:
>
> On Mon, Mar 30, 2026 at 3:12 PM David Carlier <devnexen@xxxxxxxxx> wrote:
> >
> > sio_read_complete() uses sio->pages to account global PSWPIN vm events,
> > but sio->pages tracks the number of bvec entries (folios), not base
> > pages. For large folios this undercounts compared to the per-memcg path
> > which correctly uses folio_nr_pages(), and compared to the bdev read
> > paths which also use folio_nr_pages().
> >
> > Use sio->len >> PAGE_SHIFT instead, which gives the correct base page
> > count since sio->len is accumulated via folio_size(folio).
> >
> > Fixes: a1a0dfd56f97 ("mm: handle THP in swap_*page_fs()")
> > Signed-off-by: David Carlier <devnexen@xxxxxxxxx>
>
> The patch seems theoretically correct, but I’m wondering
> where we can swap in mTHP for filesystem-based swap?
>
> In both do_swap_page() and shmem_swapin_folio(), we check
> data_race(si->flags & SWP_SYNCHRONOUS_IO) before allocating
> large folios. Am I missing something?
You're right, I missed that. SWP_FS_OPS is only set by NFS and
SMB which have no bdev, so SWP_SYNCHRONOUS_IO can never be set
alongside it. Large folios can't currently reach this path since
both do_swap_page() and shmem_swapin_folio() gate mTHP allocation
on SWP_SYNCHRONOUS_IO.
That said, sio_read_complete() already calls count_mthp_stat()
and the per-memcg accounting uses folio_nr_pages(), so the code
seems written with large folios in mind even if the path is
currently unreachable. Using sio->pages (bvec entry count) for
a base-page count is still semantically wrong, but I understand
the practical impact is nil today.
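
To make the mismatch concrete, here is a small userspace model of the two
counting schemes. It is only a sketch, not kernel code: struct sio_model,
the helper names, and the 4 KiB base page size (MODEL_PAGE_SHIFT) are
assumptions made for illustration. It only mirrors the idea that the entry
count grows by one per folio while the byte length grows by the folio size.

```c
#include <stddef.h>

/* Userspace model only; this is NOT the kernel's struct swap_iocb. */
#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB base pages */

struct sio_model {
	unsigned int pages; /* number of bvec entries, one per folio */
	size_t len;         /* total bytes, accumulated per folio */
};

/* Mimic queueing one folio made of nr_pages base pages. */
static void sio_model_add_folio(struct sio_model *sio, unsigned int nr_pages)
{
	sio->pages += 1;
	sio->len += (size_t)nr_pages << MODEL_PAGE_SHIFT;
}

/* Count based on the number of entries (what sio->pages gives). */
static unsigned long pswpin_by_entries(const struct sio_model *sio)
{
	return sio->pages;
}

/* Count based on the accumulated byte length (sio->len >> PAGE_SHIFT). */
static unsigned long pswpin_by_len(const struct sio_model *sio)
{
	return sio->len >> MODEL_PAGE_SHIFT;
}
```

With only order-0 folios the two helpers agree, which matches what is
reachable today. They diverge only once a folio larger than one base page
is queued.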
Happy to either drop this or keep it as a correctness cleanup,
whatever you and Andrew prefer.
Cheers!
>
> > ---
> > mm/page_io.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/page_io.c b/mm/page_io.c
> > index 63b262f4c5a9..1389cd57ca88 100644
> > --- a/mm/page_io.c
> > +++ b/mm/page_io.c
> > @@ -497,7 +497,7 @@ static void sio_read_complete(struct kiocb *iocb, long ret)
> > folio_mark_uptodate(folio);
> > folio_unlock(folio);
> > }
> > - count_vm_events(PSWPIN, sio->pages);
> > + count_vm_events(PSWPIN, sio->len >> PAGE_SHIFT);
> > } else {
> > for (p = 0; p < sio->pages; p++) {
> > struct folio *folio = page_folio(sio->bvec[p].bv_page);
> > --
> > 2.53.0
> >
>
> Thanks
> Barry