Re: [PATCH v5 00/21] Virtual Swap Space

From: Nhat Pham

Date: Tue Apr 14 2026 - 12:40:21 EST


On Mon, Apr 13, 2026 at 8:29 PM Kairui Song <ryncsn@xxxxxxxxx> wrote:
>
> On Tue, Apr 14, 2026 at 11:05 AM YoungJun Park <youngjun.park@xxxxxxx> wrote:
> >
>
> Hi All,
>
> > On Sat, Apr 11, 2026 at 06:40:44PM -0700, Nhat Pham wrote:
> > > > 1. Modularization
> > > >
> > > > You removed CONFIG_* and went with a unified approach. I recall
> > > > you were also considering a module-based structure at some point.
> > > > What are your thoughts on that direction?
> > > >
> > >
> > > The CONFIG-based approach was a huge mess. It makes me not want to
> > > look at the code, and I'm the author :)
> > >
> > > > If we take that approach, we could extend the recent swap ops
> > > > patchset (https://lore.kernel.org/linux-mm/20260302104016.163542-1-bhe@xxxxxxxxxx/)
> > > > as follows:
> > > > - Make vswap a swap module
> > > > - Have cluster allocation functions reside in swapops
> > > > - Enable vswap through swapon
> > >
> > > Hmmmmm.
> >
> > I think this would be a happy world, but I wonder what others think.
> > Anyway, I'm looking forward to the future direction.
> >
>
> Yeah, I agree with this.
>
> And I do think swapoff of the virtual space itself is also necessary;
> we really need a failsafe, e.g. a clean way to drop the swap
> cache and data, much like how drop_caches or fs shrinkers are
> commonly used.
>
> > > > 2. Flash-friendly swap integration (for my use case)
> > > >
> > > > I've been thinking about the flash-friendly swap concept that
> > > > I mentioned before and recently proposed:
> > > > (https://lore.kernel.org/linux-mm/aZW0voL4MmnMQlaR@yjaykim-PowerEdge-T330/)
> > > >
> > > > One of its core functions requires buffering RAM-swapped pages
> > > > and writing them sequentially at an appropriate time -- not
> > > > immediately, but in proper block-sized units, sequentially.
> > > >
> > > > This means allocated offsets must essentially be virtual, and
> > > > physical offsets need to be managed separately at the actual
> > > > write time.
> > > >
> > > > If we integrate this into the current vswap, we would either
> > > > need vswap itself to handle the sequential writes (bypassing
> > > > the physical device and receiving pages directly), or swapon
> > > > a swap device and have vswap obtain physical offsets from it.
> > > > But since those offsets cannot be used directly (due to
> > > > buffering and sequential write requirements), they become
> > > > virtual too, resulting in:
> > > >
> > > > virtual -> virtual -> physical
> > > >
> > > > This triple indirection is not ideal.
> > > >
> > > > However, if the modularization from point 1 is achieved and
> > > > vswap acts as a swap device itself, then we can cleanly
> > > > establish a:
> > > >
> > > > virtual -> physical
> > >
> > > I read that thread some time ago. Some remarks:
> > >
> > > 1. I think Christoph has a point. It seems like some of your ideas are
> > > broadly applicable to swap in general. Maybe fixing swap infra
> > > generally would make a lot of sense?
> >
> > Broadly speaking, there are two main ideas:
> > 1. Swap I/O buffering (which is also tied to cluster management issues)
> > 2. Deduplication
> >
> > Are you leaning towards the view that these two should be placed in a
> > higher layer?
>
> IMHO the swap infra should be doing less, not more, so we can have
> a more flexible design, and different backends can implement their own
> way to manage the data and layering. E.g. one backend can be
> flash-friendly, and it can do this without caring about or affecting
> other devices or backends.

I think that's what Youngjun already has, unless I misunderstand his
descriptions.

>
> > If it goes into ZSWAP, there would definitely be a clear advantage
> > in seeing dedup benefits across all swap devices. It's a technically
> > interesting area, and I'd like to discuss it in a separate thread if
> > I have more ideas or thoughts.
>
> Just brainstorming... Why don't we just merge these identical pages like
> KSM? Maybe at least zero folios might benefit a lot if we keep them
> mapped RO instead of recording them in swap; seems better in the
> long term?

That's our preferred approach too. We just didn't manage to get that
to work (yet). :)