Re: [PATCH v5 00/21] Virtual Swap Space

From: Kairui Song

Date: Tue Apr 21 2026 - 22:29:46 EST


On Wed, Apr 22, 2026 at 8:26 AM Yosry Ahmed <yosry@xxxxxxxxxx> wrote:
>
> On Fri, Mar 20, 2026 at 12:27:14PM -0700, Nhat Pham wrote:
> >
> > This patch series implements the virtual swap space idea, based on Yosry's
> > proposals at LSFMMBPF 2023 (see [1], [2], [3]), as well as valuable
> > inputs from Johannes Weiner. The same idea (with different
> > implementation details) has been floated by Rik van Riel since at least
> > 2011 (see [8]).
>
> Unfortunately, I haven't been able to keep up with virtual swap and swap
> table development, as my time is mostly being spent elsewhere these
> days. I do have a question tho, which might have already been answered
> or is too naive/stupid -- so apologies in advance.

Hi Yosry,

Not a stupid question at all; it's actually spot on. :)

>
> Given the recent advancements in the swap table and that most metadata
> and the swap cache are already being pulled into it, is it possible to
> use the swap table in the virtual swap layer instead of the xarray?
>
> Basically pull the swap table one layer higher, and have it point to
> either a zswap entry or a physical swap slot (or others in the future)?
> If my understanding is correct, we kinda get the best of both worlds and
> reuse the integration already done by the swap table with the swap
> cache, as well as the lock partitioning.
>
> In this world, the clusters would be in the virtual swap space, and we'd
> create the clusters on-demand as needed.
>
> Does this even work or make the least amount of sense (I guess the
> question is for both Nhat and Kairui)?
>

Yes, this absolutely works. In fact, I previously posted a working RFC
based on this idea. In that series, clusters are dynamically
allocated, allowing the swap space to be dynamically sized
(essentially infinite) while reusing all the existing infrastructure:
https://lore.kernel.org/all/20260220-swap-table-p4-v1-0-104795d19815@xxxxxxxxxxx/
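
To make the shape of the idea concrete, here is a rough userspace sketch of
a table whose clusters are allocated on first use and whose slots are tagged
to point at either a zswap entry or a physical swap slot. This is not code
from the RFC. Every name here (vswap, vcluster, vslot_t, the BACKING_*
tags), the 2-bit tag encoding, the 512-slot cluster size, and the fixed-size
cluster directory are assumptions made for illustration only, and all
locking is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CLUSTER_SLOTS 512   /* assumed slots per cluster (illustrative) */
#define MAX_CLUSTERS  1024  /* toy cap on the cluster directory */

/* What a virtual slot currently points at (hypothetical tags). */
enum backing { BACKING_NONE = 0, BACKING_ZSWAP = 1, BACKING_PHYS = 2 };

/* One virtual slot: low 2 bits hold the tag, upper bits hold the payload. */
typedef uint64_t vslot_t;

struct vcluster {
	vslot_t slots[CLUSTER_SLOTS];
};

struct vswap {
	struct vcluster *clusters[MAX_CLUSTERS]; /* NULL until first use */
	size_t nr_allocated;
};

static vslot_t vslot_pack(enum backing type, uint64_t payload)
{
	assert(payload < (UINT64_C(1) << 62)); /* must leave room for the tag */
	return (payload << 2) | (uint64_t)type;
}

static enum backing vslot_type(vslot_t v) { return (enum backing)(v & 3); }
static uint64_t vslot_payload(vslot_t v) { return v >> 2; }

/* Store a backing for virtual index idx, creating its cluster on demand. */
static int vswap_store(struct vswap *vs, uint64_t idx,
		       enum backing type, uint64_t payload)
{
	uint64_t ci = idx / CLUSTER_SLOTS;

	if (ci >= MAX_CLUSTERS)
		return -1;
	if (!vs->clusters[ci]) {
		vs->clusters[ci] = calloc(1, sizeof(struct vcluster));
		if (!vs->clusters[ci])
			return -1;
		vs->nr_allocated++;
	}
	vs->clusters[ci]->slots[idx % CLUSTER_SLOTS] = vslot_pack(type, payload);
	return 0;
}

/* Look up idx; returns 0 (BACKING_NONE) if its cluster was never created. */
static vslot_t vswap_load(const struct vswap *vs, uint64_t idx)
{
	uint64_t ci = idx / CLUSTER_SLOTS;

	if (ci >= MAX_CLUSTERS || !vs->clusters[ci])
		return 0;
	return vs->clusters[ci]->slots[idx % CLUSTER_SLOTS];
}

/* Free every cluster that was allocated. */
static void vswap_destroy(struct vswap *vs)
{
	for (size_t i = 0; i < MAX_CLUSTERS; i++) {
		free(vs->clusters[i]);
		vs->clusters[i] = NULL;
	}
	vs->nr_allocated = 0;
}
```

In this toy model, storing to a slot a second time with a different tag
(for example BACKING_ZSWAP first, then BACKING_PHYS) changes the backing
while the virtual index stays the same, which is the property the virtual
layer is meant to provide. Memory is only spent on clusters that have been
touched.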

The only missing pieces are a few helpers like folio_realloc_swap()
and folio_migrate_swap() for lower layer allocation and migration. I
prototyped this locally and it wasn't difficult to implement.
Furthermore, this approach works with YoungJun's tiering work with
zero conflicts, and the dynamic layer can be made optional, either
at runtime or per memcg.

To move this forward, I've stripped out the RFC features and memcg
behavior changes, and recently sent a V3 that focuses purely on the
infrastructure. It introduces no behavior changes or new features,
just optimizations.

It cleans up a lot of allocation and ordering, as well as memcg
swap lookups. Since some of these problems were also observed in the
vss discussion, I think this will make things easier for all of us:
https://lore.kernel.org/all/20260421-swap-table-p4-v3-0-2f23759a76bc@xxxxxxxxxxx/