Re: [PATCH v5 00/21] Virtual Swap Space
From: Yosry Ahmed
Date: Fri Apr 24 2026 - 14:08:30 EST
On Thu, Apr 23, 2026 at 9:16 PM Kairui Song <ryncsn@xxxxxxxxx> wrote:
>
> On Fri, Apr 24, 2026 at 04:48, Yosry Ahmed <yosry@xxxxxxxxxx> wrote:
> > > Using a swapfile does have its benefits, though. For example, the
> > > virtual layer could act as an ordinary tier following YoungJun's
> > > design:
> > > https://lore.kernel.org/linux-mm/20260421055323.940344-1-youngjun.park@xxxxxxx/
> >
> > Hmm I didn't look too closely at this but I don't understand how
> > making it a swapfile helps with tiering? If anything, I think it makes
> > tiering more difficult. For tiering to work, we need an
> > abstraction/redirection layer, such that we don't need to update the
> > page tables (or shmem pagecache) if we demote/promote pages. That is
> > exactly the use case for a virtual swap layer. The page tables point
> > at a virtual swap ID and the backend could change transparently (e.g.
> > for zswap writeback, or tiering).
> >
> > If we make the virtual layer a swapfile, how do we demote/promote
> > without updating page tables?
> >
> > IOW, I think the whole reason we want a virtual layer is to separate
> > the backends, which would facilitate tiering. If the virtual layer is
> > itself a swapfile, wouldn't it become one of the tiers?
>
> That's exactly what I hoped, virtual layer being part of the tier.
> Tier could be set up per task / cgroup. So is the virtual tier.
Just to clarify, I don't think virtual swap should be one of the
tiers. I think it should be the mechanism through which we implement
tiering (see above). I am not sure if that's what you meant.
>
> A standalone implementation of the virtual layer is more heavy than
> being a swapfile. Actually I think at this point, it is the word
> "swapfile" is misleading now. We may rename it to "swap mapping" or
> something. A swap mapping could be physical or virtual. Virtual
> mapping can realloc from physical ones (redirect), and swapoff of
> physical ones just read its data into virtual mapping's swap cache.
I don't understand this part, could you please clarify? In my mind, all
references to swap entries from outside the backend code should refer to a
virtual swap ID, which could point to physical swap, zswap, or
something else.

I *think* what you're saying is that we should make that optional, but
I don't see how this would work. If a page table is pointing at a swap
slot in a swapfile, we cannot do tiering or zswap writeback or
anything dynamic without updating page tables. Even if the system
starts off with one swapfile, we cannot assume we won't add more and
set up tiering (or enable zswap) later, right?
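
To make the redirection argument concrete, here is a rough userspace
sketch. It is not kernel code and not taken from the series; every name
in it (vswap_desc, vswap_table, fake_pte, and so on) is made up for
illustration. The page table stand-in only ever stores a virtual swap
ID, so moving the data between backends touches the descriptor table
and nothing else:

```c
#include <assert.h>

/* Hypothetical backend kinds; illustrative only, not from the series. */
enum vswap_backend { VSWAP_NONE, VSWAP_ZSWAP, VSWAP_DISK };

/* One descriptor per virtual swap ID: which backend holds the data, and where. */
struct vswap_desc {
	enum vswap_backend backend;
	unsigned long handle;	/* zswap handle or physical slot offset */
};

#define VSWAP_NR_IDS 8

static struct vswap_desc vswap_table[VSWAP_NR_IDS];

/* Stand-in for a page table entry: it stores only the virtual swap ID. */
struct fake_pte {
	unsigned long vswap_id;
};

/* Bind a virtual ID to a backend at swap-out time. */
static void vswap_store(unsigned long id, enum vswap_backend backend,
			unsigned long handle)
{
	assert(id < VSWAP_NR_IDS);
	vswap_table[id].backend = backend;
	vswap_table[id].handle = handle;
}

/*
 * Move the data behind a virtual ID to another backend (think zswap
 * writeback or demotion to a lower tier). Only the descriptor changes;
 * no PTE is touched.
 */
static void vswap_move(unsigned long id, enum vswap_backend backend,
		       unsigned long handle)
{
	vswap_store(id, backend, handle);
}

/* Swap-in path: resolve the PTE's virtual ID to its current backend. */
static struct vswap_desc vswap_lookup(const struct fake_pte *pte)
{
	assert(pte->vswap_id < VSWAP_NR_IDS);
	return vswap_table[pte->vswap_id];
}
```

If the fake PTE held a physical slot directly, vswap_move() would have
to find and rewrite every PTE (and shmem pagecache entry) referring to
that slot, which is the problem I am describing above.
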
I guess we'd keep the swap table in the swapfile and then have
it point to a different backend, but I really don't like this design.
It's unnecessarily complicated in my opinion: page tables would
refer to either a virtual swap ID or a physical swap slot.

I think we can simply have swap tables representing the virtual swap
space and pointing at the backend directly, whether or not we have
zswap or tiering set up. Is the overhead really that bad?