Re: [PATCH v2 1/2] mm/memcontrol: respect zswap.writeback setting from parent cg too
From: Mike Yuan
Date: Tue Aug 20 2024 - 05:39:49 EST
On 2024-08-19 at 12:09 -0700, Yosry Ahmed wrote:
> On Fri, Aug 16, 2024 at 7:44 AM Mike Yuan <me@xxxxxxxxxxx> wrote:
> >
> > Currently, the behavior of zswap.writeback wrt.
> > the cgroup hierarchy seems a bit odd. Unlike zswap.max,
> > it doesn't honor the value from parent cgroups. This
> > surfaced when people tried to globally disable zswap writeback,
> > i.e. reserve physical swap space only for hibernation [1] -
> > disabling zswap.writeback only for the root cgroup results
> > in subcgroups with zswap.writeback=1 still performing writeback.
> >
> > The inconsistency became more noticeable after I introduced
> > the MemoryZSwapWriteback= systemd unit setting [2] for
> > controlling the knob. The patch assumed that the kernel would
> > enforce the value of parent cgroups. It could probably be
> > worked around from systemd's side, by going up the slice unit
> > tree and inheriting the value. Yet I think it's more sensible
> > to make it behave consistently with zswap.max and friends.
> >
> > [1]
> > https://wiki.archlinux.org/title/Power_management/Suspend_and_hibernate#Disable_zswap_writeback_to_use_the_swap_space_only_for_hibernation
> > [2] https://github.com/systemd/systemd/pull/31734
> >
> > Changes in v2:
> > - Actually base on latest tree (is_zswap_enabled() ->
> > zswap_is_enabled())
> > - Updated Documentation/admin-guide/cgroup-v2.rst to reflect the
> > change
> >
> > Link to v1:
> > https://lore.kernel.org/linux-kernel/20240814171800.23558-1-me@xxxxxxxxxxx/
> >
> > Cc: Nhat Pham <nphamcs@xxxxxxxxx>
> > Cc: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> > Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> >
> > Signed-off-by: Mike Yuan <me@xxxxxxxxxxx>
> > Reviewed-by: Nhat Pham <nphamcs@xxxxxxxxx>
>
> LGTM,
> Acked-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
>
> > ---
> > Documentation/admin-guide/cgroup-v2.rst | 5 ++++-
> > mm/memcontrol.c | 9 ++++++++-
> > 2 files changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> > index 86311c2907cd..80906cea4264 100644
> > --- a/Documentation/admin-guide/cgroup-v2.rst
> > +++ b/Documentation/admin-guide/cgroup-v2.rst
> > @@ -1719,7 +1719,10 @@ The following nested keys are defined.
> > memory.zswap.writeback
> > A read-write single value file. The default value is "1". The
> > initial value of the root cgroup is 1, and when a new cgroup is
> > - created, it inherits the current value of its parent.
> > + created, it inherits the current value of its parent. Note that
> > + this setting is hierarchical, i.e. the writeback would be
> > + implicitly disabled for child cgroups if the upper hierarchy
> > + does so.
> >
> > When this is set to 0, all swapping attempts to swapping devices
> > are disabled. This included both zswap writebacks, and swapping due
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index f29157288b7d..327b2b030639 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -5320,7 +5320,14 @@ void obj_cgroup_uncharge_zswap(struct obj_cgroup *objcg, size_t size)
> > bool mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg)
> > {
> > /* if zswap is disabled, do not block pages going to the swapping device */
> > - return !zswap_is_enabled() || !memcg || READ_ONCE(memcg->zswap_writeback);
> > + if (!zswap_is_enabled())
> > + return true;
>
> This is orthogonal to this patch, but I just realized that we
> completely ignore memory.zswap.writeback if zswap is disabled. This
> means that if a cgroup has disabled writeback, then zswap is globally
> disabled for some reason, we stop respecting the cgroup knob. I guess
> the rationale could be that we want to help get pages out of zswap as
> much as possible to honor zswap's disablement? Nhat, did I get that
> right?
Hmm, I think the current behavior makes more sense. If zswap is
completely disabled, it seems intuitive that zswap-related knobs
lose their effect.
> I feel like it's a little bit odd to be honest, but I don't have a
> strong opinion on it. Maybe we should document this behavior better.
But clarifying this in the documentation certainly sounds good :)
>
> > +
> > + for (; memcg; memcg = parent_mem_cgroup(memcg))
> > + if (!READ_ONCE(memcg->zswap_writeback))
> > + return false;
> > +
> > + return true;
> > }
> >
> > static u64 zswap_current_read(struct cgroup_subsys_state *css,
> >
> > base-commit: d07b43284ab356daf7ec5ae1858a16c1c7b6adab
> > --
> > 2.46.0
> >
> >
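
For reference, here is a minimal userspace sketch of the semantics discussed
above: the patched check walks up the hierarchy, and the global zswap switch
takes precedence over the per-cgroup knob. This is only an illustration, not
kernel code. `struct cg_model` and `model_writeback_enabled()` are
hypothetical names, and the `zswap_enabled` parameter stands in for
zswap_is_enabled().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; not the kernel type. */
struct cg_model {
	struct cg_model *parent;	/* NULL for the root cgroup */
	bool zswap_writeback;		/* this cgroup's own memory.zswap.writeback */
};

/*
 * Models the patched mem_cgroup_zswap_writeback_enabled():
 * - if zswap is globally disabled, the knob is ignored and writeback
 *   is reported as allowed;
 * - otherwise writeback is allowed only if no cgroup on the path from
 *   cg up to the root has it disabled.
 * A NULL cg skips the loop and yields true, as in the patch.
 */
static bool model_writeback_enabled(const struct cg_model *cg,
				    bool zswap_enabled)
{
	if (!zswap_enabled)
		return true;

	for (; cg; cg = cg->parent)
		if (!cg->zswap_writeback)
			return false;

	return true;
}
```

With a root that has writeback set to 0 and a child set to 1, the model
returns false for the child while zswap is enabled, and true once zswap is
globally disabled, which is the case Yosry pointed out.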