Re: [PATCH v2 0/2] mm: swap: mTHP swap allocator base on swap cluster order

From: Chris Li
Date: Fri Jun 14 2024 - 22:51:29 EST


On Fri, Jun 14, 2024 at 6:06 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, 14 Jun 2024 16:48:06 -0700 Chris Li <chrisl@xxxxxxxxxx> wrote:
>
> > This is the short term solutiolns "swap cluster order" listed
> > in my "Swap Abstraction" discussion slice 8 in the recent
> > LSF/MM conference.
> >
> > When commit 845982eb264bc "mm: swap: allow storage of all mTHP
> > orders" is introduced, it only allocates the mTHP swap entries
> > from new empty cluster list. It has a fragmentation issue
> > reported by Barry.
> >
> > https://lore.kernel.org/all/CAGsJ_4zAcJkuW016Cfi6wicRr8N9X+GJJhgMQdSMp+Ah+NSgNQ@xxxxxxxxxxxxxx/
> >
> > The mTHP allocation failure rate raises to almost 100% after a few
> > hours in Barry's test run.
> >
> > The reason is that all the empty cluster has been exhausted while
> > there are planty of free swap entries to in the cluster that is
> > not 100% free.
> >
> > Remember the swap allocation order in the cluster.
> > Keep track of the per order non full cluster list for later allocation.
> >
> > This greatly improve the sucess rate of the mTHP swap allocation.
> >
>
> I'm having trouble understanding the overall impact of this on users.
> We fail the mTHP swap allocation and fall back, but things continue to
> operate OK?

Continue to operate OK in the sense that the mTHP will have to split
into 4K pages before the swap out, aka the fall back. The swap out and
swap in can continue to work as 4K pages, not as the mTHP. Due to the
fallback, the mTHP based zsmalloc compression with 64K buffer will not
happen. That is the effect of the fallback. But mTHP swap out and swap
in is relatively new, it is not really a regression.

>
> > There is some test number in the V1 thread of this series:
> > https://lore.kernel.org/r/20240524-swap-allocator-v1-0-47861b423b26@xxxxxxxxxx
>
> Well, please let's get the latest numbers into the latest patchset.
> Along with a higher-level (and quantitative) description of the user impact.

I will need Barray's help to collect the number. I don't have the
setup to reproduce his test result.
Maybe a follow up commit message amendment for the test number when I get it?

>
> I'll add this into mm-unstable now for some exposure, but at this point
> I'm not able to determine whether it should go in as a hotfix for
> 6.10-rcX.

Maybe not need to be a hotfix. Not all Barry's mTHP swap out and swap
in patch got merged yet.

Chris