Re: [PATCH v3 2/4] mm/zswap: Implement proactive writeback

From: YoungJun Park

Date: Fri Jun 12 2026 - 03:28:04 EST


On Thu, Jun 11, 2026 at 12:12:40PM -0700, Shakeel Butt wrote:
> On Thu, Jun 11, 2026 at 05:45:04PM +0000, Yosry Ahmed wrote:
> > On Tue, Jun 09, 2026 at 01:19:13PM +0900, YoungJun Park wrote:
> > > On Mon, Jun 08, 2026 at 03:27:07PM -0700, Yosry Ahmed wrote:
> > >
> > > +Chris +Kairui +Baoquan
> > >
> > > Hello
> > >
> > > Thanks for inviting me to the discussion, Shakeel.
> > >
> > > > > > > Youngjun is working on swap tiers. At the moment he is more interested in
> > > > > > > allowing a specific swap device to a memcg or not. I can imagine in future there
> > > > > > > will be use-cases where there will be a need to demote data on higher tier swap
> > > > > > > to lower tier swap. What would be the appropriate interface?
> > >
> > > Speaking of my work on swap tiers, I recently submitted a patch and am
> > > currently considering memcg integration:
> > > https://lore.kernel.org/linux-mm/20260527062247.3440692-1-youngjun.park@xxxxxxx/
> > >
> > > The future use-cases imagined above seem to align with this
> > > direction. (BTW, I am currently waiting for reviews/feedback from the memcg
> > > folks on this patch. Any reviews would be highly appreciated!)
> > >
> > > We could potentially assign a target tier
> > > for writeback within the existing memory.zswap.writeback interface.
> > >
> > > For instance, '0' could mean disabled, while non-zero values could represent
> > > specific tiers, which would maintain backward compatibility with the current
> > > version. Alternatively, if zswap is treated as the default top tier,
> > > the `memory.swap.tiers` interface could potentially replace `memory.zswap.writeback`.
> > >
> > > Furthermore, this could be expanded to support user-triggered demotion of
> > > data between swap tiers.
> > >
> > > Based on the current patch's ideas combined with my swap tiers concept:
> > >
> > > Assuming a hierarchy like:
> > > zswap -> tier1 (SSD swap) -> tier2 (HDD swap) -> tier3 (Network swap)
> > >
> > > We could configure the active tiers via a setting like `memory.swap.tiers`
> > > (tier2 enabled, tier3 enabled).
> > >
> > > For example, the concept of `echo "100M zswap_writeback_only" > memory.reclaim`
> > > could be extended. A user could run `echo "100M tier2" > memory.reclaim`
> > > to explicitly trigger demotion from tier2 to tier3.
> > > (BTW, if we combine these features, my personal preference for the keyword
> > > format would be `<size> <demote_prefix><tier_name>`. I think it would be
> > > better to explicitly indicate that it is a swap demotion by using a specific
> > > prefix followed by the tier name.
> > > Making the demote prefix a separate key is also possible.)
> >
> > I am not sure if proactive demotion between swap tiers would be driven
> > by memory.reclaim, I am guessing a new interface might be more suitable.
> > But yes, you are right that it's very possible that
> > 'zswap_writeback_only' with memory.reclaim will become obsolete once
> > swap tiering matures and starts supporting things like proactive
> > demotion.
> >
> > Part of me wants to wait until the swap tiering interfaces are figured
> > out so that we don't end up with redundant interfaces, but I also don't
> > want to hold Hao's work since it doesn't directly depend on swap
> > tiering.
> >
> > Shakeel, how do you want to handle this? I think there's a few options:
> >
> > 1. Add zswap_writeback_only now, and when we have swap tiering demotion
> > it becomes a redundant interface, like memory.zswap.writeback -- or
> > maybe we try to deprecate both of them at that point. It's difficult to
> > remove interfaces tho, but maybe easier to stop supporting
> > zswap_writeback_only.
> >
> > 2. Add zswap_writeback_only behind an experimental config option, to
> > unblock development but have a line of sight to dropping support once we
> > have a swap tiering interface.
> >
> > 3. Wait until we figure out the swap tiering interfaces and then add
> > the proactive zswap writeback as part of it.
> >
> > WDYT?
>
> Is Hao's work needed for some followup work/development? The earliest Hao's
> work can land is 7.3, so if we aim to figure out swap tiering interfaces in the
> next couple of weeks then option 3 is the way to go. If swap tiers take more
> time then we can discuss other options as well.
> However, I would need help from the zswap folks (Yosry & Nhat) in figuring out
> the swap tiers interfaces. Zswap is currently the top-tier swap in real-world
> use. I want zswap users to easily (and hopefully transparently) migrate to swap
> tiers.

I am looking forward to the discussion on this interface!

To help move the discussion forward, I would like to share a few thoughts.
We could either introduce a new interface to trigger demotion/promotion,
or reuse the existing one and use tiers only internally.
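
As a rough illustration of reusing the existing interface, here is a
userspace sketch of parsing a memory.reclaim-style request in the
`<size> <demote_prefix><tier_name>` format I floated earlier. The `demote:`
prefix, the struct, and the function name are all hypothetical; nothing like
this exists in the kernel today.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical prefix; the actual keyword is still open for discussion. */
#define DEMOTE_PREFIX "demote:"

struct reclaim_req {
	unsigned long long bytes;	/* amount requested */
	bool demote;			/* true if a source tier was named */
	char tier[16];			/* source tier name, e.g. "tier2" */
};

/* Parse "<size>[K|M|G] [<demote_prefix><tier_name>]"; returns 0 on success. */
static int parse_reclaim_req(const char *buf, struct reclaim_req *req)
{
	unsigned int shift = 0;
	size_t len;
	char *end;

	memset(req, 0, sizeof(*req));
	req->bytes = strtoull(buf, &end, 10);
	if (end == buf)
		return -1;		/* no size given */

	/* Optional binary size suffix. */
	if (*end == 'K')
		shift = 10;
	else if (*end == 'M')
		shift = 20;
	else if (*end == 'G')
		shift = 30;
	if (shift) {
		req->bytes <<= shift;
		end++;
	}

	while (*end == ' ')
		end++;
	if (*end == '\0' || *end == '\n')
		return 0;		/* plain reclaim, no demotion */

	if (strncmp(end, DEMOTE_PREFIX, strlen(DEMOTE_PREFIX)) != 0)
		return -1;		/* unknown keyword */
	end += strlen(DEMOTE_PREFIX);

	len = strcspn(end, "\n");	/* ignore the newline added by echo */
	if (len == 0 || len >= sizeof(req->tier))
		return -1;
	memcpy(req->tier, end, len);	/* already NUL-terminated by memset */
	req->demote = true;
	return 0;
}
```

Keeping the prefix glued to the tier name means a plain "<size>" write stays
valid, so existing users of the file would not need to change anything.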

I think the memcg interface currently proposed in swap_tier
(memory.swap.tiers, memory.swap.tiers.effective) aligns well with the
current direction. It provides a foundation for selectively targeting
devices in tier order.
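
To make "targeting devices in tier order" concrete, here is a minimal sketch
of picking the next demotion target from a per-cgroup mask of enabled tiers.
The tier indices, the bitmask representation, and the function name are
assumptions for illustration only; they are not taken from the swap_tier
patches.

```c
#include <stdint.h>

/* Illustrative tier indices, fastest first; zswap as tier 0 is an assumption. */
enum { TIER_ZSWAP, TIER_SSD, TIER_HDD, TIER_NET, NR_TIERS };

/*
 * Return the next enabled tier below @from, or -1 if there is none.
 * @effective is a bitmask of enabled tiers, standing in for whatever
 * memory.swap.tiers.effective would report for the cgroup.
 */
static int next_demotion_target(uint32_t effective, int from)
{
	for (int t = from + 1; t < NR_TIERS; t++)
		if (effective & (1u << t))
			return t;
	return -1;
}
```

With only zswap, HDD, and network swap enabled, demotion from zswap would
skip the disabled SSD tier and land on HDD.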

To summarize the discussions so far, the following points fit well with
swap tiers:

- Per-cgroup swap control, as I suggested.
- Proactive zswap writeback (Hao's use case).
- Demotion to a target swap device (selective demotion would be even better), as you mentioned:
https://lore.kernel.org/linux-mm/aicZ-5GX9De3MAU7@xxxxxxxxx/
- Virtual swap on/off in the future, as Nhat mentioned:
https://lore.kernel.org/linux-mm/20260528212955.1912856-1-nphamcs@xxxxxxxxx/
- An alternative to memory.zswap.writeback (no hierarchy model conflict).
- zswap as the first swap tier.
- Promotion (selective use would also be better here).
- Tier-based swap policy (e.g. round-robin).

To accelerate this work, I believe we should reach a consensus and
merge the currently proposed swap_tier interface :)

If that is difficult, I would like to suggest an alternative way to make
progress, with the memcg interfaces removed:

1) We could make zswap the first tier and create a use case where
memory.zswap.writeback is handled internally by the tier logic.

2) Or simply merge the swap_tier infrastructure itself first.

This would allow the swap_tier infrastructure to be merged and discussed
more easily.
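
For option 1), one possible internal mapping is sketched below, assuming
zswap occupies the lowest bit of a tier mask. The mask layout and both helper
names are hypothetical and only meant to show the idea that disabling
writeback is the same as restricting the cgroup to the zswap tier.

```c
#include <stdbool.h>
#include <stdint.h>

#define TIER_ZSWAP_BIT 0x1u	/* assumption: zswap is the lowest tier bit */

/*
 * Translate the existing memory.zswap.writeback setting into a tier mask.
 * @configured is the set of tiers that exist on the system.
 */
static uint32_t tiers_from_zswap_writeback(bool writeback, uint32_t configured)
{
	if (!writeback)
		return configured & TIER_ZSWAP_BIT;	/* zswap only */
	return configured;				/* all tiers usable */
}

/* Writeback is possible only if some tier below zswap is enabled. */
static bool zswap_writeback_allowed(uint32_t effective)
{
	return (effective & ~TIER_ZSWAP_BIT) != 0;
}
```

The user-visible file would keep its current 0/1 semantics, and only the
internal check would change.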

If adopting swap_tier takes longer anyway, we could still take the next steps
as experimental features:

- Apply per-cgroup swap as an experimental (debugfs) feature.
- Apply Hao's use case experimentally, or as is, as Yosry suggested
(with a future migration to swap tiers).

What do you think?

(FYI: My emails to kernel.org are failing due to internal server issues.)

Thank you
Youngjun Park