Re: [PATCH v2] mm/zswap: invalidate old entry when store fail or !zswap_enabled

From: Yosry Ahmed
Date: Tue Feb 06 2024 - 22:29:04 EST


On Wed, Feb 07, 2024 at 02:58:27AM +0000, chengming.zhou@xxxxxxxxx wrote:
> From: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
>
> We may encounter duplicate entry in the zswap_store():
>
> 1. swap slot that freed to per-cpu swap cache, doesn't invalidate
> the zswap entry, then got reused. This has been fixed.
>
> 2. !exclusive load mode, swapin folio will leave its zswap entry
> on the tree, then swapout again. This has been removed.
>
> 3. one folio can be dirtied again after zswap_store(), so need to
> zswap_store() again. This should be handled correctly.
>
> So we must invalidate the old duplicate entry before insert the
> new one, which actually doesn't have to be done at the beginning
> of zswap_store(). And this is a normal situation, we shouldn't
> WARN_ON(1) in this case, so delete it. (The WARN_ON(1) seems want
> to detect swap entry UAF problem? But not very necessary here.)
>
> The good point is that we don't need to lock tree twice in the
> store success path.
>
> Note we still need to invalidate the old duplicate entry in the
> store failure path, otherwise the new data in swapfile could be
> overwrite by the old data in zswap pool when lru writeback.
>
> We have to do this even when !zswap_enabled since zswap can be
> disabled anytime. If the folio store success before, then got
> dirtied again but zswap disabled, we won't invalidate the old
> duplicate entry in the zswap_store(). So later lru writeback
> may overwrite the new data in swapfile.
>
> Fixes: 42c06a0e8ebe ("mm: kill frontswap")
> Cc: <stable@xxxxxxxxxxxxxxx>
> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>

LGTM, with a few grammatical fixes to the comments below:

Acked-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>

> @@ -1608,14 +1598,12 @@ bool zswap_store(struct folio *folio)
> /* map */
> spin_lock(&tree->lock);
> /*
> - * A duplicate entry should have been removed at the beginning of this
> - * function. Since the swap entry should be pinned, if a duplicate is
> - * found again here it means that something went wrong in the swap
> - * cache.
> + * The folio could be dirtied again, invalidate the possible old entry
> + * before insert this new entry.

	/*
	 * The folio may have been dirtied again, invalidate the
	 * possibly stale entry before inserting the new entry.
	 */

> */
> - while (zswap_rb_insert(&tree->rbroot, entry, &dupentry) == -EEXIST) {
> - WARN_ON(1);
> + if (zswap_rb_insert(&tree->rbroot, entry, &dupentry) == -EEXIST) {
> zswap_invalidate_entry(tree, dupentry);
> + VM_WARN_ON(zswap_rb_insert(&tree->rbroot, entry, &dupentry));
> }
> if (entry->length) {
> INIT_LIST_HEAD(&entry->lru);
> @@ -1638,6 +1626,17 @@ bool zswap_store(struct folio *folio)
> reject:
> if (objcg)
> obj_cgroup_put(objcg);
> +check_old:
> + /*
> + * If zswap store fail or zswap disabled, we must invalidate possible
> + * old entry which previously stored by this folio. Otherwise, later
> + * writeback could overwrite the new data in swapfile.
> + */


	/*
	 * If the zswap store fails or zswap is disabled, we must invalidate the
	 * possibly stale entry which was previously stored at this offset.
	 * Otherwise, writeback could overwrite the new data in the swapfile.
	 */

> + spin_lock(&tree->lock);
> + entry = zswap_rb_search(&tree->rbroot, offset);
> + if (entry)
> + zswap_invalidate_entry(tree, entry);
> + spin_unlock(&tree->lock);
> return false;
>
> shrink:
> --
> 2.40.1
>
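
For anyone following the thread, here is a minimal user-space sketch of the
invariant the patch enforces. It is not kernel code: the toy_* names, the
fixed-size array standing in for the rb-tree, and the enabled/compress_ok
flags are all hypothetical, and locking is left out entirely. It only shows
the two rules from the commit message: on a successful store a duplicate is
replaced rather than warned about, and on a failed or disabled store any
stale entry at the same offset is dropped so that a later writeback cannot
overwrite newer data in the swapfile.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 8

/* Toy stand-in for the zswap tree: at most one entry per swap offset. */
struct toy_entry {
	bool present;
	int data;
};

static struct toy_entry toy_tree[TOY_SLOTS];

/* Mimics an insert that refuses duplicates: -EEXIST if the slot is taken. */
static int toy_insert(size_t offset, int data)
{
	if (toy_tree[offset].present)
		return -EEXIST;
	toy_tree[offset].present = true;
	toy_tree[offset].data = data;
	return 0;
}

/* Drops whatever entry (if any) is cached at this offset. */
static void toy_invalidate(size_t offset)
{
	toy_tree[offset].present = false;
}

/*
 * Hypothetical model of the patched store path.
 * enabled:     stands in for zswap_enabled
 * compress_ok: stands in for "the store did not hit a reject path"
 */
static bool toy_store(size_t offset, int data, bool enabled, bool compress_ok)
{
	if (!enabled || !compress_ok) {
		/* Failure or disabled: a stale entry must not survive. */
		toy_invalidate(offset);
		return false;
	}

	if (toy_insert(offset, data) == -EEXIST) {
		/* The folio was dirtied and stored again: replace the old entry. */
		toy_invalidate(offset);
		int ret = toy_insert(offset, data);
		assert(ret == 0); /* the slot was just emptied */
		(void)ret;
	}
	return true;
}

/* Returns true and fills *data if an entry is cached at this offset. */
static bool toy_lookup(size_t offset, int *data)
{
	if (!toy_tree[offset].present)
		return false;
	*data = toy_tree[offset].data;
	return true;
}
```

In this model, storing twice at the same offset leaves only the newer data,
and a store attempted with enabled == false (or compress_ok == false) leaves
nothing behind for writeback to find at that offset.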