Re: [PATCH v2 2/5] KVM: guest_memfd: Fix possible signed integer overflow

From: Ackerley Tng

Date: Wed May 27 2026 - 16:23:23 EST


Sean Christopherson <seanjc@xxxxxxxxxx> writes:

>
> [...snip...]
>
>> >> [Use size_t for size instead of u64]
>> >
>> > Why? Oh, right, because kvm_memory_slot.npages is an "unsigned long". The
>> > discrepancy between a u64 for the offset and a size_t for the size is confusing,
>> > as they are both conceptually in the same "domain".
>> >
>> > Rather than u64 and size_t, we should use pgoff_t, which is what KVM already uses
>> > as the storage for kvm_memory_slot.gmem.pgoff.
>> >
>>
>> I picked size_t more because I thought it was semantically correct to
>> use the size type for a size. size_t may have different sizes (64 vs
>> 32), but in the comparison offset + size > i_size_read(inode), size is
>> promoted to 64 bits, and signed inode size is cast to unsigned for
>> comparison, so I think that works.
>>
>> pgoff_t is also unsigned, but I think that should be reserved for page
>> offsets/indices?
>
> Just to avoid confusion over the definition of an offset/idnex:
>
> * The type of an index into the pagecache.
>
> I.e. it's not the 12-bit offset into a 4KiB page. Which I'm pretty sure you were
> saying as well, just want to ensure we're on the same page.
>

Wait yes I meant index as in the key if you think of the xarray of the
page cache as a key-value store.

> I like pgoff_t more than size_t because, for KVM, it's really all about addressing
> memory, thanks to the offset into guest_memfd being associated 1:1 with a GPA.

The offset into guest_memfd is associated 1:1 with a GPA, and this
offset is

offset = index << PAGE_SHIFT

> It's not perfect, because GPAs are tracked as 64-bit values, whereas the kernel
> restricts itself to "unsigned long". But that's a non-issue in practice since
> guest_memfd is 64-bit only.
>
> But conceptually, I like tracking the gmem offset as a pgoff_t to tie it back
> to using GPAs to offset/index into gmem. And for all intents and purposes, gmem
> is nothing more than a glorified pagecache :-)

So we actually want to use u64s for gmem offsets (where offset = index
<< PAGE_SHIFT), and pgoff_t for indices, since indices (aka page
offsets) are semantically the offset, counted in units of pages?

I pulled this conclusion together from filemap-related code like
filemap_add_folio() takes a pgoff_t index, so I thought gmem should
follow that and stick with pgoff_t for index/indices.