Re: [PATCH RFC] mm: ghost swapfile support for zswap

From: Nhat Pham
Date: Mon Nov 24 2025 - 10:37:04 EST


On Fri, Nov 21, 2025 at 5:52 PM Chris Li <chrisl@xxxxxxxxxx> wrote:
>
> On Fri, Nov 21, 2025 at 3:40 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> >
> > On Fri, Nov 21, 2025 at 01:31:43AM -0800, Chris Li wrote:
> > > The current zswap requires a backing swapfile. The swap slot used
> > > > by zswap is not able to be used by the swapfile. That wastes swapfile
> > > > space.
> > >
> > > The ghost swapfile is a swapfile that only contains the swapfile header
> > > > for zswap. The swapfile header indicates the size of the swapfile. There
> > > is no swap data section in the ghost swapfile, therefore, no waste of
> > > swapfile space. As such, any write to a ghost swapfile will fail. To
> > > > prevent accidental reads or writes of a ghost swapfile, the bdev of
> > > > swap_info_struct is set to NULL. The ghost swapfile will also set the SSD
> > > > flag because there is no rotational disk access when using zswap.
> >
> > Zswap is primarily a compressed cache for real swap on secondary
> > storage. It's indeed quite important that entries currently in zswap
> > don't occupy disk slots; but for a solution to this to be acceptable,
> > it has to work with the primary usecase and support disk writeback.
>
> Well, my plan is to support the writeback via swap.tiers.
>
> > This direction is a dead-end. Please take a look at Nhat's swap
> > virtualization patches. They decouple zswap from disk geometry, while
> > still supporting writeback to an actual backend file.
>
> Yes, there are many ways to decouple zswap from disk geometry, my swap
> table + swap.tiers design can do that as well. I have concerns about
> swap virtualization adding another layer of per-swap-entry memory
> overhead, plus the CPU overhead of an extra xarray
> lookup. I believe my approach is technically superior and cleaner.

True, but the static nature of the current swapfile infrastructure
also imposes a space overhead and/or operational overhead.

I did play around with a prototype that used a ghost swapfile for
virtual swap, but had to stop because of the swapfile overhead for a
larger virtual swap space.
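
To put a rough shape on that overhead, here is a back-of-the-envelope
sketch. It assumes 4 KiB pages and a fixed per-slot metadata cost that is
paid up front for every slot the swapfile advertises. The bytes_per_slot
values are illustrative assumptions, not measurements from the prototype,
and static_overhead_bytes() is a hypothetical helper.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
MIB = 1 << 20
TIB = 1 << 40


def static_overhead_bytes(swap_size_bytes, bytes_per_slot):
    """Metadata paid up front for a swapfile of the advertised size.

    bytes_per_slot is an assumed, illustrative per-slot cost.
    """
    slots = swap_size_bytes // PAGE_SIZE
    return slots * bytes_per_slot


if __name__ == "__main__":
    # Cost grows with the advertised size, whether or not slots are used.
    for bytes_per_slot in (1, 4, 8):
        overhead = static_overhead_bytes(TIB, bytes_per_slot)
        print(f"{bytes_per_slot} B/slot: {overhead // MIB} MiB for 1 TiB")
```

Under these assumptions the cost scales with the advertised size rather
than with the number of slots in use, which is what hurts when the virtual
swap space is made large.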

> Both faster and cleaner. Basically swap.tiers + VFS like swap read
> write page ops. I will let Nhat clarify the performance and memory

That just solves static placement, no? Backend transfer requires
something extra/orthogonal.

> overhead side of the swap virtualization.
>
> I am not against swap entry redirection. Just the swap virtualization

There will be redirection either way. I don't think it's avoidable.
The only choice is whether to shove it into the backend (what zram is
doing) or to have a generalized module (swap virtualization).

Or do a page table walk every time you want to do backend transfer
(what swapoff is doing).
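
A toy model of the difference, purely as a sketch: this is not kernel
code, and VirtualSwap, transfer_by_pte_walk() and the backend handles are
all hypothetical names made up for illustration. With an indirection
table, a backend transfer is one table update and the entries held in
page tables stay valid. Without it, every page table entry that refers
to the old location has to be found and rewritten.

```python
class VirtualSwap:
    """Hypothetical indirection layer: virtual slot -> (backend, handle)."""

    def __init__(self):
        self.table = {}
        self.next_slot = 0

    def store(self, backend, handle):
        slot = self.next_slot
        self.next_slot += 1
        self.table[slot] = (backend, handle)
        return slot

    def transfer(self, slot, new_backend, new_handle):
        # One table update; anything holding `slot` is left untouched.
        self.table[slot] = (new_backend, new_handle)

    def lookup(self, slot):
        return self.table[slot]


def transfer_by_pte_walk(ptes, old_entry, new_entry):
    """No indirection: scan all PTEs and rewrite those matching old_entry.

    Returns the number of PTEs visited.
    """
    visited = 0
    for i, entry in enumerate(ptes):
        visited += 1
        if entry == old_entry:
            ptes[i] = new_entry
    return visited


if __name__ == "__main__":
    vswap = VirtualSwap()
    slot = vswap.store("zswap", 17)
    ptes = [slot, None, slot]          # two mappings of one swapped-out page
    vswap.transfer(slot, "disk", 4242)
    print(ptes, vswap.lookup(slot))    # PTEs unchanged, backend updated

    direct = [("zswap", 17), None, ("zswap", 17)]
    print(transfer_by_pte_walk(direct, ("zswap", 17), ("disk", 4242)), direct)
```

The first variant pays for the table on every lookup; the second pays on
every transfer. Pushing the table into the backend, as in the zram case,
keeps the same lookup cost but hides it behind the backend's own handle.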

> series needs to compare against the alternatives in terms of memory
> overhead and throughput.
> Solving it from the swap.tiers angle is cleaner.
>
> > Nacked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
>
> I take that the only relevant part is you are zswap maintainer and I
> am the swap maintainer. Fine. I got the message. I will leave the
> zswap alone. I will find other ways to address the memory-based swap
> tiers in swap.tiers.

Please keep this discussion technical and not pull rank unnecessarily.

>
> Chris