Re: [PATCH v4 0/5] mm/zswap: Implement per-cgroup proactive writeback
From: Yosry Ahmed
Date: Mon Jun 22 2026 - 17:29:38 EST
On Mon, Jun 22, 2026 at 3:04 AM Youngjun Park <youngjun.park@xxxxxxx> wrote:
>
> On Mon, Jun 22, 2026 at 02:08:49PM +0800, Hao Jia wrote:
> >
> >
> > On 2026/6/21 12:20, Muchun Song wrote:
> > >
> > >
> > > > On Jun 18, 2026, at 12:48, Hao Jia <jiahao.kernel@xxxxxxxxx> wrote:
> > > >
> > > > From: Hao Jia <jiahao1@xxxxxxxxxxx>
> > > >
> > > > Zswap currently writes back pages to backing swap reactively, triggered
> > > > either by the shrinker or by the pool reaching its size limit. Although
> > > > proactive memory reclaim can automatically write back a portion of zswap
> > > > pages via the shrinker, it cannot explicitly control the amount of
> > > > writeback for a specific memory cgroup. Moreover, proactive memory reclaim
> > > > may not always be triggered during a steady state.
> > > >
> > > > In certain scenarios, it is desirable to trigger writeback in advance to
> > > > free up memory. For example, users may want to prepare for an upcoming
> > > > memory-intensive workload by flushing cold memory to the backing storage
> > > > when the system is relatively idle.
> > > >
> > > > This patch series introduces a "zswap_writeback_only" key to memory.reclaim
> > > > cgroup interface, allowing users to proactively write back cold compressed
> > > > data from zswap to the backing swap device. When specified, this key
> > > > bypasses standard memory reclaim and exclusively performs proactive zswap
> > > > writeback up to the requested budget. If omitted, the default reclaim
> > > > behavior remains unchanged.
> > > >
> > > > Example usage:
> > > > # Write back 10MB of compressed data from zswap to the backing swap
> > > > echo "10M zswap_writeback_only" > memory.reclaim
> > >
> > > I’m not entirely sure if other candidate names were already brought up
> > > in previous discussions, so my apologies if I'm repeating something here!
> > > I do think expanding memory.reclaim is a great approach. That said, I
> > > was wondering if we could make the interface a bit more concise while
> > > keeping it flexible for future extensions.
> > >
> > > Essentially, what we want is to control the specific targets of the reclaim
> > > process—such as file, anon, or zswap. What do you think about using
> > > something like "source=zswap"? For instance, if we want to reclaim 10M from
> > > zswap, the command would look like this:
> > >
> > > echo "10M source=zswap" > memory.reclaim
I like this suggestion, but I think ultimately we want proactive zswap
writeback to be part of a more general proactive swap demotion
mechanism, where zswap is just one swap tier.
> > >
[..]
>
> I also preferred sharing the `memory.reclaim` interface with future swap demotion,
> since it already takes `zswap_writeback_only`.
> https://lore.kernel.org/all/aieUQUBHI+E3uNPW@yjaykim-PowerEdge-T330/
>
> Alternatively, we could use a separate interface as Yosry suggested
> (e.g. 'swap.tiers.demote'?).
>
> But as Nhat pointed out, allowing user-triggered demotion from the swap tier
> perspective could lead to issues like LRU inversion. We probably need to
> discuss whether this kind of user-triggered tier demotion will actually be
> supported at all.
> https://lore.kernel.org/linux-mm/CAKEwX=NfSy0XiD_UMsDOHGCwpE7sYmBmhV4Y9vk_cbnnr6J6PQ@xxxxxxxxxxxxxx/
I believe what Nhat said is that swap demotion may be used to
prevent/alleviate LRU inversion, not cause it. I don't see how
demotion can cause LRU inversion.
>
> So, IMHO..
>
> 1. If swap tier demotion is NOT exposed.
>
> We can simply choose between "source=" and `zswap_writeback_only` based
> on preference. (since there is no need to consider "swap_tier" demotion.)
>
> However, "source=" seems to offer better extensibility if it is expanded
> to file and anon use cases in the future.
>
> 2. If swap tier demotion IS exposed.
> We need to consider integration vs decoupling.
>
> (In my view, this is a design consideration: avoiding potentially
> redundant interfaces vs. adding a new one if it is architecturally correct.)
>
> 2.1 Integration
>   - Integrating into 'memory.reclaim':
>     - "source=": Seems easier to integrate by explicitly specifying the target. (Your suggestion)
>     - 'zswap_writeback_only': Harder to integrate than "source=".
>
>   - Integrating into 'memory.swap.tiers.demote'
>     - 'memory.swap.tiers.demote' could absorb the memory.reclaim functionality.
>       (But since we only want to allow tiering for vswap+zswap cases like
>       the zswap writeback feature as we discussed, the reclaim interface behavior might
>       still need to stay for zswap only.)
>
> 2.2 Decoupling
>   - 'memory.swap.tiers.demote' handles other swap devices (excluding zswap),
>     while "source=" or 'zswap_writeback_only' handles only zswap.
I personally think it makes sense to treat proactive zswap writeback as
one use case of proactive swap demotion, and that swap demotion makes
sense in general.
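
To make the comparison between the two spellings concrete, here is a rough
userspace sketch in Python. It is purely illustrative: parse_size() and
parse_reclaim_request() are hypothetical helpers, not the kernel's
memory.reclaim parser, and both "source=zswap" and "zswap_writeback_only"
are only proposals from this thread, not existing keys.

```python
# Hypothetical userspace sketch; NOT the kernel's memory.reclaim parser.
# The keys shown in this thread ("source=zswap", "zswap_writeback_only")
# are proposals only.

UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '10M' or '4096' to a byte count."""
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)


def parse_reclaim_request(request):
    """Split '<size> [key[=value] ...]' into (bytes, options)."""
    fields = request.split()
    if not fields:
        raise ValueError("empty request")
    options = {}
    for field in fields[1:]:
        key, sep, value = field.partition("=")
        # A bare flag such as zswap_writeback_only carries no value,
        # so it is recorded as True.
        options[key] = value if sep else True
    return parse_size(fields[0]), options
```

In this sketch, a key=value form would let a new reclaim target show up as
another value of "source", while each bare flag needs its own key.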