Re: [PATCH v6 15/43] KVM: guest_memfd: Handle lru_add fbatch refcounts during conversion safety check

From: Fuad Tabba

Date: Thu May 21 2026 - 03:15:52 EST


On Thu, 7 May 2026 at 21:22, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@xxxxxxxxxx> wrote:
>
> From: Ackerley Tng <ackerleytng@xxxxxxxxxx>
>
> When checking if a guest_memfd folio is safe for conversion, its refcount
> is examined. A folio may be present in a per-CPU lru_add fbatch, which
> temporarily increases its refcount. This can lead to a false positive,
> incorrectly indicating that the folio is in use and preventing the
> conversion, even if it is otherwise safe. The conversion process might not
> be on the same CPU that holds the folio in its fbatch, making a simple
> per-CPU check insufficient.
>
> To address this, drain all CPUs' lru_add fbatches if an unexpectedly high
> refcount is encountered during the safety check. This is performed at most
> once per conversion request. Draining only if the folio in question may be
> lru cached.
>
> guest_memfd folios are unevictable, so they can only reside in the lru_add
> fbatch. If the folio's refcount is still unsafe after draining, then the
> conversion is truly deemed unsafe.
>
> Signed-off-by: Ackerley Tng <ackerleytng@xxxxxxxxxx>

Not an area I've worked with that much, but it seems right to me:

Reviewed-by: Fuad Tabba <tabba@xxxxxxxxxx>

Cheers,
/fuad


> ---
> mm/swap.c | 2 ++
> virt/kvm/guest_memfd.c | 18 ++++++++++++++----
> 2 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/mm/swap.c b/mm/swap.c
> index 5cc44f0de9877..3134d9d3d7c30 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -37,6 +37,7 @@
> #include <linux/page_idle.h>
> #include <linux/local_lock.h>
> #include <linux/buffer_head.h>
> +#include <linux/kvm_types.h>
>
> #include "internal.h"
>
> @@ -904,6 +905,7 @@ void lru_add_drain_all(void)
> lru_add_drain();
> }
> #endif /* CONFIG_SMP */
> +EXPORT_SYMBOL_FOR_KVM(lru_add_drain_all);
>
> atomic_t lru_disable_count = ATOMIC_INIT(0);
>
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 034b72b4947fb..050a8c092b1a3 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -8,6 +8,7 @@
> #include <linux/mempolicy.h>
> #include <linux/pseudo_fs.h>
> #include <linux/pagemap.h>
> +#include <linux/swap.h>
>
> #include "kvm_mm.h"
>
> @@ -596,18 +597,27 @@ static bool kvm_gmem_is_safe_for_conversion(struct inode *inode, pgoff_t start,
> const int filemap_get_folios_refcount = 1;
> pgoff_t last = start + nr_pages - 1;
> struct folio_batch fbatch;
> + bool lru_drained = false;
> bool safe = true;
> int i;
>
> folio_batch_init(&fbatch);
> while (safe && filemap_get_folios(mapping, &start, last, &fbatch)) {
>
> - for (i = 0; i < folio_batch_count(&fbatch); ++i) {
> + for (i = 0; i < folio_batch_count(&fbatch);) {
> struct folio *folio = fbatch.folios[i];
>
> - if (folio_ref_count(folio) !=
> - folio_nr_pages(folio) + filemap_get_folios_refcount) {
> - safe = false;
> + safe = (folio_ref_count(folio) ==
> + folio_nr_pages(folio) +
> + filemap_get_folios_refcount);
> +
> + if (safe) {
> + ++i;
> + } else if (folio_may_be_lru_cached(folio) &&
> + !lru_drained) {
> + lru_add_drain_all();
> + lru_drained = true;
> + } else {
> *err_index = folio->index;
> break;
> }
>
> --
> 2.54.0.563.g4f69b47b94-goog
>
>