Re: [PATCH v5 00/21] Virtual Swap Space

From: Nhat Pham

Date: Fri May 01 2026 - 09:42:45 EST


On Tue, Apr 28, 2026 at 7:46 PM Kairui Song <ryncsn@xxxxxxxxx> wrote:
>
> On Tue, Apr 28, 2026 at 2:23 AM Yosry Ahmed <yosry@xxxxxxxxxx> wrote:
> >
> > On Fri, Apr 24, 2026 at 12:52 PM Kairui Song <ryncsn@xxxxxxxxx> wrote:
> > >
> > > On Sat, Apr 25, 2026 at 3:12 AM Yosry Ahmed <yosry@xxxxxxxxxx> wrote:
> > > > Why >16 bytes? Do we need anything extra other than the reverse
> > > > mapping? Also why do we need a double lookup?
> > >
> > > You will have to store at least the following info: memcg (2 bytes),
> > > shadow (8 bytes), count (at least 1 byte), and reverse mapping (8
> > > bytes, since you have to address a full virtual swap space). And some
> > > type info is also needed. Part of them can be shrunk but still,
> > > scientifically, merging two layers into one is considered a kind of
> > > optimization.
> > >
> > > You need to look up the virtual layer, then the lower layer, for many
> > > decisions. It was discussed before to introduce more cache bits or
> > > things like that, and I think that is getting overly complex. It
> > > reminds me of the slot cache or HAS_CACHE thing...:
> > > https://lore.kernel.org/linux-mm/CAMgjq7DJrtE-jARik849kCufd0qNnZQs7C8fcyzVOKE14-O+Dw@xxxxxxxxxxxxxx/
> >
> > I think that's where the disconnect is. You are considering these two
> > separate layers, each with its own metadata. The metadata should only
> > live in one place.
> >
> > If we only have swap tables in the virtual swap layer (with the
> > metadata), backends do not have to carry the metadata. In this case,
> > backends should only have a reverse mapping (if needed), and some
> > internal data structure (e.g. bitmaps) to track usage.
>
> Ah, you are right. This is currently an intermediate state, that
> problem might be gone if we unified everything.

What do you mean here?

>
> > This is difficult to achieve if the virtual swap layer is optional,
> > because then the metadata can live in different places. This is why I
>
> But that's not difficult to achieve at all with an optional layer, and
> actually will be achieved naturally without any design change with the
> RFC I posted. Swap count / cgroup / shadow all stay in the top layer,
> lower layer is "reverse map" only (that part is not done yet, though: it
> will require moving the cluster cache from global to device level,
> which is also required for YoungJun's tier or any functional tiering to
> work, and we may run into more and more detailed issues like this).
>
> Might even be easier that way, it's pretty close to the unified state I think.

I feel like you're moving in the other direction, no? Seems like
you are unifying swap metadata, which is good (vswap will also want to
do this), but the problem is, the lower layer will have to allocate
memory for this metadata too...

Say vswap is optional and runtime enabled. How do you structure a
physical swap device's metadata? Some of the slots might be directly
mapped to PTEs, some might back vswap slots. These two cases require
two completely different sets of metadata: the former needs reference
count, swap cache, swap cgroup etc., whereas the latter only needs
reverse mapping...
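
To make the concern concrete, here is a rough userspace sketch. All
struct, field, and function names below are made up for illustration
(they are not from this series or from the RFC), and the field widths
just follow the sizes quoted earlier in the thread. It only shows that
a device mixing both kinds of slots ends up sizing every slot for the
larger case:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: slot mapped directly by PTEs, carries full metadata. */
struct direct_slot_meta {
	uint64_t shadow;	/* stand-in for swap cache / shadow entry */
	uint16_t memcg_id;	/* swap cgroup */
	uint8_t  count;		/* reference count */
};

/* Hypothetical: slot backing a vswap slot, only needs the reverse map. */
struct backing_slot_meta {
	uint64_t vswap_slot;	/* index into the virtual swap space */
};

enum slot_kind { SLOT_DIRECT, SLOT_BACKING };

/* Hypothetical: what a device mixing both kinds would need per slot. */
struct mixed_slot_meta {
	uint8_t kind;		/* enum slot_kind */
	union {
		struct direct_slot_meta  direct;
		struct backing_slot_meta backing;
	} u;
};

/* A mixed slot is always larger than a reverse-map-only slot. */
static_assert(sizeof(struct mixed_slot_meta) >
	      sizeof(struct backing_slot_meta),
	      "mixed layout pays for metadata that backing slots never use");

/* Extra bytes spent when nr_backing slots of a mixed device back vswap. */
static inline size_t mixed_overhead_bytes(size_t nr_backing)
{
	return nr_backing * (sizeof(struct mixed_slot_meta) -
			     sizeof(struct backing_slot_meta));
}
```

Exact sizes depend on the ABI's padding, so treat this as an
illustration of the shape of the problem rather than real numbers.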

I don't think we should mix vswap and non-vswap slots in the same
type/address space.