Re: [PATCH v2] mm: count zeromap read and set for swapout and swapin

From: Barry Song
Date: Mon Nov 04 2024 - 20:28:32 EST


On Tue, Nov 5, 2024 at 10:24 AM Usama Arif <usamaarif642@xxxxxxxxx> wrote:
>
>
>
> On 04/11/2024 20:56, David Hildenbrand wrote:
> > On 04.11.24 19:48, Usama Arif wrote:
> >>
> >>
> >> On 04/11/2024 17:10, David Hildenbrand wrote:
> >>> On 04.11.24 17:34, Johannes Weiner wrote:
> >>>> On Mon, Nov 04, 2024 at 01:42:08PM +0100, David Hildenbrand wrote:
> >>>>> On 02.11.24 11:12, Barry Song wrote:
> >>>>>> @@ -1599,6 +1599,16 @@ The following nested keys are defined.
> >>>>>> pglazyfreed (npn)
> >>>>>> Amount of reclaimed lazyfree pages
> >>>>>> + swpin_zero
> >>>>>> + Number of pages moved into memory with zero content, meaning no
> >>>>>> + copy exists in the backend swapfile, allowing swap-in to avoid
> >>>>>> + I/O read overhead.
> >>>>>> +
> >>>>>> + swpout_zero
> >>>>>> + Number of pages moved out of memory with zero content, meaning no
> >>>>>> + copy is needed in the backend swapfile, allowing swap-out to avoid
> >>>>>> + I/O write overhead.
> >>>>>
> >>>>> Hm, can we make it a bit clearer that this is a pure optimization and refer
> >>>>> to the other counters?
> >>>>>
> >>>>> swpin_zero
> >>>>> Portion of "pswpin" pages for which I/O was optimized out
> >>>>> because the page content was detected to be zero during swapout.
> >>>>
> >>>> AFAICS the zeropages currently don't show up in pswpin/pswpout, so
> >>>> these are independent counters, not subsets.
> >>>
> >>> Ah, now I understand the problem. The whole "move out of memory" / "move into memory" wording here is quite confusing TBH. We're not moving anything, we're optimizing out the move completely ... yes, you could call it compression (below).
> >>>
> >>>>
> >>>> I'm leaning towards Barry's side on the fixes tag.
> >>>
> >>> I think the documentation on when to use the Fixes: tag is pretty clear.
> >>>
> >>> Introducing new counters can hardly be considered a bugfix. Failing to adjust some counters that *existing tools* would know/use might be, IMO (below).
> >>>
> >>>> When zswap handled
> >>>> the same-filled pages, we would count them in zswpin/out. From a user
> >>>> POV, especially one using zswap, the behavior didn't change, but the
> >>>> counts giving insight into this (potentially significant) VM activity
> >>>> disappeared. This is arguably a regression.
> >>>>>> swpout_zero
> >>>>> Portion of "pswpout" pages for which I/O was optimized out
> >>>>> because the page content was detected to be zero.
> >>>>
> >>>> Are we sure we want to commit to the "zero" in the name here? Until
> >>>> very recently, zswap optimized all same-filled pages. It's possible
> >>>> somebody might want to bring that back down the line.
> >>>
> >>> Agreed.
> >>>
> >>>>
> >>>> In reference to the above, I'd actually prefer putting them back into
> >>>> zswpin/zswpout. Sure, they're not handled by zswap.c proper, but this
> >>>> is arguably just an implementation detail; from a user POV this is
> >>>> still just (a form of) compression in lieu of IO to the swap backend.
> >>>>
> >>>> IMO there is no need for coming up with a separate category. Just add
> >>>> them to zswpin/zswpout and remove the CONFIG_ZSWAP guards from them?
> >>>
> >>
> >> hmm, I actually don't like the idea of using zswpin/zswpout. It's a
> >> bit confusing if zswap is disabled and zswap counters are incrementing?
> >>
> >> Also, it means that when zswap is enabled, you won't be able to distinguish
> >> between zswap and zeropage optimization.
> >
> > Does it matter? Because in the past the same would have happened, no (back when this was done in zswap code)?
> >
>
> When it was in zswap code, there was a zswap_same_filled_pages stat as well to see
> how many zero-filled pages were part of zswap. (Not the same as a counter, but you
> could still get a good idea about same-filled page usage.)
>
> The other thing is it affects zram as well.
>
> Maybe we could have a hybrid approach?
> i.e. have the zswpin/zswpout counters incremented for zero-filled pages as suggested,
> but then also have a zero_swapped stat that tells how much of the zeromap is
> currently set (similar to zswapped).

I still think we should keep zswap and zeromap separate. On a system
without zswap, zero-page swap-in and swap-out are included in pswpin and
pswpout counts.

Although zram has same_page_filled, it's still treated as a block device
after the swap layer.
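
To illustrate what keeping the counters separate looks like from
userspace, here is a minimal sketch (not part of the patch). It assumes
the counters are exposed under the names proposed in this patch
(swpin_zero, swpout_zero) in a "name value" file such as memory.stat or
/proc/vmstat. The helper names parse_counters and swap_summary are
hypothetical.

```python
# Hypothetical userspace helper; not kernel code and not part of the patch.
SWAP_COUNTERS = (
    "pswpin", "pswpout",          # pages read from / written to the swap backend
    "zswpin", "zswpout",          # pages handled by zswap
    "swpin_zero", "swpout_zero",  # names proposed in this patch (assumed)
)


def parse_counters(text):
    """Parse 'name value' lines into a dict, skipping anything else."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_summary(counters):
    """Return each swap counter separately; None means it is not exposed."""
    return {name: counters.get(name) for name in SWAP_COUNTERS}
```

Feeding it the contents of /proc/vmstat (or a cgroup's memory.stat)
reports the three categories side by side. A counter that the running
kernel does not expose comes back as None rather than 0, so a missing
counter is not confused with "no activity".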

Thanks
Barry