Re: [PATCH v2] mm/page_io: fix PSWPIN undercount for large folios in sio_read_complete()

From: Barry Song

Date: Wed Apr 01 2026 - 16:25:10 EST


On Wed, Apr 1, 2026 at 3:10 PM David CARLIER <devnexen@xxxxxxxxx> wrote:
>
> On Tue, 31 Mar 2026 at 23:33, Barry Song <21cnbao@xxxxxxxxx> wrote:
> >
> > On Mon, Mar 30, 2026 at 3:12 PM David Carlier <devnexen@xxxxxxxxx> wrote:
> > >
> > > sio_read_complete() uses sio->pages to account global PSWPIN vm events,
> > > but sio->pages tracks the number of bvec entries (folios), not base
> > > pages. For large folios this undercounts compared to the per-memcg path
> > > which correctly uses folio_nr_pages(), and compared to the bdev read
> > > paths which also use folio_nr_pages().
> > >
> > > Use sio->len >> PAGE_SHIFT instead, which gives the correct base page
> > > count since sio->len is accumulated via folio_size(folio).
> > >
> > > Fixes: a1a0dfd56f97 ("mm: handle THP in swap_*page_fs()")
> > > Signed-off-by: David Carlier <devnexen@xxxxxxxxx>
> >
> > The patch seems theoretically correct, but I’m wondering
> > where we can swap in mTHP for filesystem-based swap?
> >
> > In both do_swap_page() and shmem_swapin_folio(), we check
> > data_race(si->flags & SWP_SYNCHRONOUS_IO) before allocating
> > large folios. Am I missing something?
>
> ▎ The patch seems theoretically correct, but I'm wondering
> ▎ where we can swap in mTHP for filesystem-based swap?
>
> ▎ In both do_swap_page() and shmem_swapin_folio(), we check
> ▎ data_race(si->flags & SWP_SYNCHRONOUS_IO) before allocating
> ▎ large folios. Am I missing something?
>
> You're right, I missed that. SWP_FS_OPS is only set by NFS and
> SMB which have no bdev, so SWP_SYNCHRONOUS_IO can never be set
> alongside it. Large folios can't currently reach this path since
> both do_swap_page() and shmem_swapin_folio() gate mTHP allocation
> on SWP_SYNCHRONOUS_IO.
>
> That said, sio_read_complete() already calls count_mthp_stat()
> and the per-memcg accounting uses folio_nr_pages(), so the code
> seems written with large folios in mind even if the path is
> currently unreachable. Using sio->pages (bvec entry count) for
> a base-page count is still semantically wrong, but I understand
> the practical impact is nil today.
>
> Happy to either drop this or keep it as a correctness cleanup,
> whatever you and Andrew prefer.
>

The patch looks correct, so I’d personally support keeping it
after cleaning up the description, as David suggested.

Best Regards
Barry