Re: [PATCH v3] mm/zswap: invalidate old entry when store fail or !zswap_enabled

From: Chengming Zhou
Date: Wed Feb 07 2024 - 06:36:41 EST


On 2024/2/7 11:38, chengming.zhou@xxxxxxxxx wrote:
> From: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
>
> We may encounter a duplicate entry in zswap_store() in three cases:
>
> 1. A swap slot freed to the per-cpu swap cache does not invalidate
> its zswap entry, and is then reused. This has been fixed.
>
> 2. In !exclusive load mode, a swapped-in folio leaves its zswap entry
> on the tree and is then swapped out again. This mode has been removed.
>
> 3. A folio can be dirtied again after zswap_store(), so zswap_store()
> has to be called on it again. This case should be handled correctly.
>
> So we must invalidate the old duplicate entry before inserting the
> new one, which doesn't actually have to be done at the beginning
> of zswap_store(). This is a normal situation, so we shouldn't
> WARN_ON(1) in this case; delete it. (The WARN_ON(1) seems intended
> to detect a swap entry UAF problem, but it is not really necessary here.)
>
> The benefit is that we don't need to lock the tree twice in the
> store success path.
>
> Note that we still need to invalidate the old duplicate entry in the
> store failure path; otherwise the new data in the swapfile could be
> overwritten by the old data in the zswap pool during LRU writeback.
>
> We have to do this even when !zswap_enabled, since zswap can be
> disabled at any time. If the folio was stored successfully before, then
> got dirtied again while zswap was disabled, we wouldn't invalidate the
> old duplicate entry in zswap_store(). Later LRU writeback could then
> overwrite the new data in the swapfile.
>
> Fixes: 42c06a0e8ebe ("mm: kill frontswap")
> Cc: <stable@xxxxxxxxxxxxxxx>
> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> Acked-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
> ---
> v3:
> - Fix a few grammatical problems in comments, per Yosry.
>
> v2:
> - Change the duplicate entry invalidation loop to an if: since we hold
> the lock, we won't find the entry again once we invalidate it, per Yosry.
> - Add Fixes tag.
> ---
> mm/zswap.c | 33 ++++++++++++++++-----------------
> 1 file changed, 16 insertions(+), 17 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index cd67f7f6b302..d9d8947d6761 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1518,18 +1518,8 @@ bool zswap_store(struct folio *folio)
> return false;
>
> if (!zswap_enabled)
> - return false;
> + goto check_old;
>
> - /*
> - * If this is a duplicate, it must be removed before attempting to store
> - * it, otherwise, if the store fails the old page won't be removed from
> - * the tree, and it might be written back overriding the new data.
> - */
> - spin_lock(&tree->lock);
> - entry = zswap_rb_search(&tree->rbroot, offset);
> - if (entry)
> - zswap_invalidate_entry(tree, entry);
> - spin_unlock(&tree->lock);
> objcg = get_obj_cgroup_from_folio(folio);
> if (objcg && !obj_cgroup_may_zswap(objcg)) {
> memcg = get_mem_cgroup_from_objcg(objcg);
> @@ -1608,14 +1598,12 @@ bool zswap_store(struct folio *folio)
> /* map */
> spin_lock(&tree->lock);
> /*
> - * A duplicate entry should have been removed at the beginning of this
> - * function. Since the swap entry should be pinned, if a duplicate is
> - * found again here it means that something went wrong in the swap
> - * cache.
> + * The folio may have been dirtied again, invalidate the
> + * possibly stale entry before inserting the new entry.
> */
> - while (zswap_rb_insert(&tree->rbroot, entry, &dupentry) == -EEXIST) {
> - WARN_ON(1);
> + if (zswap_rb_insert(&tree->rbroot, entry, &dupentry) == -EEXIST) {
> zswap_invalidate_entry(tree, dupentry);
> + VM_WARN_ON(zswap_rb_insert(&tree->rbroot, entry, &dupentry));

Oh, I just realized VM_WARN_ON() compiles to nothing if !CONFIG_DEBUG_VM,
so the zswap_rb_insert() inside it would never run. Will post v4.
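
To illustrate the problem in isolation, here is a minimal userspace sketch.
All names (DEMO_VM_WARN_ON, DEMO_DEBUG_VM, demo_insert) are hypothetical
stand-ins, not kernel code. It assumes the non-debug macro only type-checks
its argument via sizeof, so a call placed inside it is never executed. The
"fixed" variant is just one possible shape, not necessarily what v4 will do.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for a debug-only warning macro.
 * Without DEMO_DEBUG_VM the condition is only type-checked via sizeof,
 * so any side effect inside it is dropped.
 */
#ifdef DEMO_DEBUG_VM
#define DEMO_VM_WARN_ON(cond) \
	((void)((cond) && fprintf(stderr, "warning\n")))
#else
#define DEMO_VM_WARN_ON(cond) ((void)sizeof((long)(cond)))
#endif

static int inserted;

/* Hypothetical insert helper: records that it ran, returns 0 on success. */
static int demo_insert(void)
{
	inserted = 1;
	return 0;
}

/* Insert lives inside the macro: skipped when DEMO_DEBUG_VM is unset. */
static int demo_buggy(void)
{
	inserted = 0;
	DEMO_VM_WARN_ON(demo_insert());
	return inserted;
}

/* Insert runs unconditionally; only the check is debug-only. */
static int demo_fixed(void)
{
	int ret;

	inserted = 0;
	ret = demo_insert();
	DEMO_VM_WARN_ON(ret);
	return inserted;
}
```

Built without -DDEMO_DEBUG_VM, demo_buggy() never performs the insert,
while demo_fixed() always does.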

Thanks.

> }
> if (entry->length) {
> INIT_LIST_HEAD(&entry->lru);
> @@ -1638,6 +1626,17 @@ bool zswap_store(struct folio *folio)
> reject:
> if (objcg)
> obj_cgroup_put(objcg);
> +check_old:
> + /*
> + * If the zswap store fails or zswap is disabled, we must invalidate the
> + * possibly stale entry which was previously stored at this offset.
> + * Otherwise, writeback could overwrite the new data in the swapfile.
> + */
> + spin_lock(&tree->lock);
> + entry = zswap_rb_search(&tree->rbroot, offset);
> + if (entry)
> + zswap_invalidate_entry(tree, entry);
> + spin_unlock(&tree->lock);
> return false;
>
> shrink: