Re: [PATCH v2 2/5] KVM: guest_memfd: Fix possible signed integer overflow

From: Sean Christopherson

Date: Wed May 27 2026 - 15:26:13 EST


On Wed, May 27, 2026, Ackerley Tng wrote:
> Sean Christopherson <seanjc@xxxxxxxxxx> writes:
>
> > For shortlogs (and changeloges), when possible, describe the _change_ itself, not
> > its impact is. Sometimes "Fix xyz" is the best shortlog, e.g. when fixing build
> > failures, but here, I would go with:
> >
> > KVM: guest_memfd: Treat memslot binding offset+size as unsigned values
> >
> > for two reasons. First, it provides a lot more context for future readers, versus
> > "Fix possible signed integer overflow" which doesn't even capture what flow is
> > affected, how the overflow is being fixed, etc. Second, if the fix is wrong,
> > incomplete, etc., we don't end up with a follow-up patch that start with "Really
> > fix ...".
> >
>
> Thanks for explaining!
>
> > Oh, actually, three reasons. This doesn't only affect the overflow check. The
> > check on a negative offset is flawed, as it means KVM would incorrectly reject
> > bindings with (comically) large offsets.
> >
>
> Makes sense.
>
> > LOL, four. There is no bug. The size of the memslot is ((1UL << 31) - 1)
> > pages, i.e. 0x7FF_FFFFF000:
> >
> > if (id < KVM_USER_MEM_SLOTS &&
> > (mem->memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES)
> > return -EINVAL;
> >
> > and so "loff_t size" can never be negative.
>
> I think the bug was that the sum of offset + size in kvm_gmem_bind()
> when interpreted as signed integers could be smaller than
> i_size_read(inode) and allow binding.
>
> So IIUC even if size is small (and not negative), nothing catches a
> large enough offset where offset + size (interpreted as unsigned
> integers) doesn't overflow, but offset + size (interpreted as signed
> integers) overflows.

Oooh, duh, if @offset is positive, but @offset+size is negative. Yes, that's a
real bug, confirmed via selftest. I'll send a fix along with a selftest testcase.

Thanks much!

> >> Fixes: a7800aa80ea4d ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory")
> >> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> >> [Use size_t for size instead of u64]
> >
> > Why? Oh, right, because kvm_memory_slot.npages is an "unsigned long". The
> > discrepancy between a u64 for the offset and a size_t for the size is confusing,
> > as they are both conceptually in the same "domain".
> >
> > Rather than u64 and size_t, we should use pgoff_t, which is what KVM already uses
> > as the storage for kvm_memory_slot.gmem.pgoff.
> >
>
> I picked size_t more because I thought it was semantically correct to
> use the size type for a size. size_t may have different sizes (64 vs
> 32), but in the comparison offset + size > i_size_read(inode), size is
> promoted to 64 bits, and signed inode size is cast to unsigned for
> comparison, so I think that works.
>
> pgoff_t is also unsigned, but I think that should be reserved for page
> offsets/indices?

Just to avoid confusion over the definition of an offset/idnex:

* The type of an index into the pagecache.

I.e. it's not the 12-bit offset into a 4KiB page. Which I'm pretty sure you were
saying as well, just want to ensure we're on the same page.

I like pgoff_t more than size_t because, for KVM, it's really all about addressing
memory, thanks to the offset into guest_memfd being associated 1:1 with a GPA.
It's not perfect, because GPAs are tracked as 64-bit values, whereas the kernel
restricts itself to "unsigned long". But that's a non-issue in practice since
guest_memfd is 64-bit only.

But conceptually, I like tracking the gmem offset as a pgoff_t to tie it back
to using GPAs to offset/index into gmem. And for all intents and purposes, gmem
is nothing more than a glorified pagecache :-)