Re: [PATCH 1/2] mm: zswap.c: add xarray tree to zswap

From: Chris Li
Date: Fri Jan 19 2024 - 00:24:38 EST


Hi Yosry,

On Wed, Jan 17, 2024 at 10:21 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
>
> On Wed, Jan 17, 2024 at 7:06 PM Chris Li <chrisl@xxxxxxxxxx> wrote:
> >
> > The xarray tree is added alongside the zswap RB tree.
> > Checks for the xarray get the same result as the RB tree operations.
> >
> > Rename the zswap RB tree function to a more generic function
> > name without the RB part.
>
> As I mentioned in the cover letter, I believe this should be squashed
> into the second patch. I have some comments below as well on the parts
> that should remain after the squash.
>
> [..]
> >
> > @@ -462,9 +463,9 @@ static void zswap_lru_putback(struct list_lru *list_lru,
> > /*********************************
> > * rbtree functions
> > **********************************/
> > -static struct zswap_entry *zswap_rb_search(struct rb_root *root, pgoff_t offset)
> > +static struct zswap_entry *zswap_search(struct zswap_tree *tree, pgoff_t offset)
>
> Let's change the zswap_rb_* prefixes to zswap_tree_* instead of just
> zswap_*. Otherwise, it will be confusing to have both zswap_store and
> zswap_insert (as well as zswap_load and zswap_search).

How about zswap_xa_* ?

>
> [..]
> > @@ -1790,15 +1808,21 @@ void zswap_swapon(int type)
> > void zswap_swapoff(int type)
> > {
> > struct zswap_tree *tree = zswap_trees[type];
> > - struct zswap_entry *entry, *n;
> > + struct zswap_entry *entry, *e, *n;
> > + XA_STATE(xas, tree ? &tree->xarray : NULL, 0);
> >
> > if (!tree)
> > return;
> >
> > /* walk the tree and free everything */
> > spin_lock(&tree->lock);
> > +
> > + xas_for_each(&xas, e, ULONG_MAX)
>
> Why not use xa_for_each?
>
> > + zswap_invalidate_entry(tree, e);
> > +
> > rbtree_postorder_for_each_entry_safe(entry, n, &tree->rbroot, rbnode)
> > - zswap_free_entry(entry);
>
> Replacing zswap_free_entry() with zswap_invalidate_entry() is a
> behavioral change that should be done separate from this series, but I
> am wondering why it's needed. IIUC, the swapoff code should be making
> sure there are no ongoing swapin/swapout operations, and there are no
> pages left in zswap to writeback.
>
> Is it the case that swapoff may race with writeback, such that
> writeback is holding the last remaining ref after zswap_invalidate()
> is called, and then zswap_swapoff() is called freeing the zswap entry
> while writeback is still accessing it?

For the RB tree the mapping is stored in the zswap entry as RB node.
That is different from xarray. Xarry stores the mapping outside of
zswap entry. Just freeing the entry does not remove the mapping from
xarray. Therefore it needs to call zswap_invalidate_entry() to remove
the entry from the xarray. I could call zswap_erase() then free entry.
I just think zswap_invalidate_entry() is more consistent with the rest
of the code.

Chris