Re: [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values

From: Ackerley Tng

Date: Thu May 28 2026 - 16:49:41 EST


Sean Christopherson <seanjc@xxxxxxxxxx> writes:

> On Thu, May 28, 2026, Ackerley Tng wrote:
>> Sean Christopherson <seanjc@xxxxxxxxxx> writes:
>>
>> >
>> > [...snip...]
>> >
>> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
>> > index bf9659a7b0f6..a1cb72e66288 100644
>> > --- a/virt/kvm/guest_memfd.c
>> > +++ b/virt/kvm/guest_memfd.c
>> > @@ -640,15 +640,16 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
>> > }
>> >
>> > int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
>> > - unsigned int fd, loff_t offset)
>> > + unsigned int fd, uoff_t offset)
>> > {
>> > - loff_t size = slot->npages << PAGE_SHIFT;
>> > + uoff_t size = slot->npages << PAGE_SHIFT;
>>
>> I get why uoff_t is chosen for byte offset, but why is uoff_t also
>> picked for size, over size_t?
>
> Because size describes the max offset.
>
>> Not trying to nitpick, guest_memfd deals with lots of sizes, byte and
>> page offsets so I'd like to derive a mental model for using these types
>> in functions.
>
> Earlier you asked what I meant by "same domain", and I forgot to respond. What
> I meant by "domain" is that both "size" and "offset" operate in the "number of
> bytes in the file" domain, and so should use the same type.
>
> E.g. size_t isn't literally just for sizes, it's also used when comparing against
> a size, e.g.
>
> $ git grep "size_t i;" | wc -l
> 476
>
> Changing all of those to uoff_t because their effectively offsets, not sizes,
> would be nonsensical.
>

Got it, so the takeaway is that local variables' type (especially if
only used in one case) should mostly align with what it's being compared
with.

>> > unsigned long start, end;
>> > struct gmem_file *f;
>> > struct inode *inode;
>> > struct file *file;
>> > int r = -EINVAL;
>> >
>> > + BUILD_BUG_ON(sizeof(gpa_t) != sizeof(offset));
>> > BUILD_BUG_ON(sizeof(gfn_t) != sizeof(slot->gmem.pgoff));
>> >
>>
>> I never knew this was meant to show the concept/typing relationship
>> between gpa_t and byte offset, gfn_t and page offset. Can we add a
>> comment to explain the presence of BUILD_BUG_ON()?
>
> Eh, I'd honestly rather not. I agree that on the whole, the kernel tends to be
> far too stingy with comments. However, at some point a baseline of knowledge is
> required. For me, understanding trunctation falls well below that line. It
> does require the reader to know that guest_memfd bindings are gpa:offset, but
> that too is core/basic knowledge that the reader needs to have to make any sense
> of this code.
>

I agree that both truncation and knowing that guest_memfd bindings are
gpa:offset are both below the line, the part that's new is using
BUILD_BUG_ON() to remind people of the binding. Is this BUILD_BUG_ON()
also the part that strictly requires guest_memfd to be 64-bit only?

>> Also, what's the rationale for picking BUILD_BUG_ON() over
>> static_assert()?
>
> https://lore.kernel.org/all/aI05DvQlMWJXewUi@xxxxxxxxxx
>

This is helpful, thanks!

>> static_assert() could be placed outside block scope which makes this
>> relationship declaration more general/global. Putting it in the function
>> makes it seem local (seems to suggest trying to assert some guard for just
>> this function).
>
> Well, yeah, because it is local. _This_ code relies on the sizes being the same,
> otherwise we'd have to deal with potential truncation, whereas overall KVM does
> NOT. The purpose of adding asserts like this is to document an assumption and
> guard against future breakage.
>

I saw the gpa:offset and gfn:page_offset relationships as local to
guest_memfd as a whole rather than just this function.

Either way good to know for next time!

> Just because we _can_ put the assertion in a global location doesn't mean we
> should.
>
> KVM does use static_assert(). But without exception (at least in x86, haven't
> checked other architectures), every usage is still co-located with the exact code
> that relies on the assumption that's guarded by the assertion.