Re: [PATCH] mm/zswap: avoid touching XArray for unnecessary invalidation

From: Yosry Ahmed
Date: Fri Oct 11 2024 - 13:54:23 EST


On Fri, Oct 11, 2024 at 10:20 AM Kairui Song <ryncsn@xxxxxxxxx> wrote:
>
> From: Kairui Song <kasong@xxxxxxxxxxx>
>
> zswap_invalidate() simply calls xa_erase(), which acquires the XArray
> lock first, then does a lookup. This has a higher overhead even if
> zswap is not in use or the tree is empty.
>
> So instead, do a very lightweight xa_empty() check first; if there is
> nothing to erase, don't touch the lock or the tree.
>
> Using xa_empty() rather than zswap_never_enabled() is more helpful, as
> it covers both the case where zswap was never used and the case where
> the particular range doesn't have any zswap entry. And it's safe, as
> the swap slot should currently be pinned by the caller with HAS_CACHE.

You actually made me wonder whether it's better to replace
zswap_never_enabled() with something like zswap_may_have_swpentry()
that checks xa_empty() as well.
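
Something like this strawman, untested sketch (the helper name is just
illustrative):

static inline bool zswap_may_have_swpentry(swp_entry_t swp)
{
	/* Skip the tree entirely if zswap was never enabled or it's empty. */
	return !zswap_never_enabled() && !xa_empty(swap_zswap_tree(swp));
}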

>
> Sequential SWAP in/out tests with zswap disabled showed a minor
> performance gain, and SWAP in of zero pages with zswap enabled also
> showed a performance gain (swapout is basically unchanged, so only
> one swapout case is tested):
>
> Swapout of 2G of zero pages using brd as SWAP, zswap disabled
> (total time, 4 test runs, +0.1%):
> Before: 1705013 us 1703119 us 1704335 us 1705848 us.
> After: 1703579 us 1710640 us 1703625 us 1708699 us.
>
> Swapin of 2G of zero pages using brd as SWAP, zswap disabled
> (total time, 4 test runs, -3.5%):
> Before: 1912312 us 1915692 us 1905837 us 1912706 us.
> After: 1845354 us 1849691 us 1845868 us 1841828 us.
>
> Swapin of 2G of zero pages using brd as SWAP, zswap enabled
> (total time, 4 test runs, -3.3%):
> Before: 1897994 us 1894681 us 1899982 us 1898333 us
> After: 1835894 us 1834113 us 1832047 us 1833125 us
>
> Swapin of 2G of random pages using brd as SWAP, zswap enabled
> (total time, 4 test runs, -0.1%):
> Before: 4519747 us 4431078 us 4430185 us 4439999 us
> After: 4492176 us 4437796 us 4434612 us 4434289 us
>
> And performance is very slightly better or unchanged for the kernel
> build test with zswap enabled or disabled.
>
> Build of the Linux kernel with defconfig and -j32 in a 1G memory cgroup,
> using brd SWAP, zswap disabled (sys time in seconds, 6 test runs, -0.1%):
> Before: 1648.83 1653.52 1666.34 1665.95 1663.06 1656.67
> After: 1651.36 1661.89 1645.70 1657.45 1662.07 1652.83
>
> Build of the Linux kernel with defconfig and -j32 in a 2G memory cgroup,
> using brd SWAP, zswap enabled (sys time in seconds, 6 test runs, -0.3%):
> Before: 1240.25 1254.06 1246.77 1265.92 1244.23 1227.74
> After: 1226.41 1218.21 1249.12 1249.13 1244.39 1233.01

Nice!

Do you know how the results change if we use xa_load() instead?
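
Something like this rough, untested sketch is what I mean:

	if (!xa_load(tree, offset))
		return;

	entry = xa_erase(tree, offset);
	if (entry)
		zswap_entry_free(entry);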

Or even do something like this to avoid doing the lookup twice:

XA_STATE(xas, tree, offset);

rcu_read_lock();
entry = xas_load(&xas);
if (entry) {
	xas_lock(&xas);
	/* The slot is pinned with HAS_CACHE, so the entry shouldn't have moved. */
	WARN_ON_ONCE(xas_reload(&xas) != entry);
	xas_store(&xas, NULL);
	xas_unlock(&xas);
}
rcu_read_unlock();

On one hand, xa_empty() is cheaper. OTOH, xas_load() will also avoid
the lock if the tree is not empty yet the entry is not there. Actually
there's no reason why we can't check both.
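
Roughly (untested):

	if (xa_empty(tree) || !xa_load(tree, offset))
		return;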

I think the benefit of this would be most visible with SSD swap,
because zswap_load() will erase the entry from the xarray, and
zswap_invalidate() should always be able to avoid holding the lock.

>
> Signed-off-by: Kairui Song <kasong@xxxxxxxxxxx>

Anyway, all of this is kinda orthogonal to this patch,

Acked-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>

> ---
> mm/zswap.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 7f00cc918e7c..f6316b66fb23 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1641,6 +1641,9 @@ void zswap_invalidate(swp_entry_t swp)
>  	struct xarray *tree = swap_zswap_tree(swp);
>  	struct zswap_entry *entry;
>  
> +	if (xa_empty(tree))
> +		return;
> +
>  	entry = xa_erase(tree, offset);
>  	if (entry)
>  		zswap_entry_free(entry);
> --
> 2.47.0
>