Re: [PATCH v3 1/3] KVM: guest_memfd: Treat memslot binding offset+size as unsigned values

From: Sean Christopherson

Date: Thu May 28 2026 - 14:04:52 EST

On Thu, May 28, 2026, Ackerley Tng wrote:
> Sean Christopherson <seanjc@xxxxxxxxxx> writes:
>
> >
> > [...snip...]
> >
> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > index bf9659a7b0f6..a1cb72e66288 100644
> > --- a/virt/kvm/guest_memfd.c
> > +++ b/virt/kvm/guest_memfd.c
> > @@ -640,15 +640,16 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
> > }
> >
> > int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
> > - unsigned int fd, loff_t offset)
> > + unsigned int fd, uoff_t offset)
> > {
> > - loff_t size = slot->npages << PAGE_SHIFT;
> > + uoff_t size = slot->npages << PAGE_SHIFT;
>
> I get why uoff_t is chosen for byte offset, but why is uoff_t also
> picked for size, over size_t?

Because size describes the max offset.

> Not trying to nitpick, guest_memfd deals with lots of sizes, byte and
> page offsets so I'd like to derive a mental model for using these types
> in functions.

Earlier you asked what I meant by "same domain", and I forgot to respond. What
I meant by "domain" is that both "size" and "offset" operate in the "number of
bytes in the file" domain, and so should use the same type.

E.g. size_t isn't literally just for sizes, it's also used when comparing against
a size, e.g.

$ git grep "size_t i;" | wc -l
476

Changing all of those to uoff_t because their effectively offsets, not sizes,
would be nonsensical.

> > unsigned long start, end;
> > struct gmem_file *f;
> > struct inode *inode;
> > struct file *file;
> > int r = -EINVAL;
> >
> > + BUILD_BUG_ON(sizeof(gpa_t) != sizeof(offset));
> > BUILD_BUG_ON(sizeof(gfn_t) != sizeof(slot->gmem.pgoff));
> >
>
> I never knew this was meant to show the concept/typing relationship
> between gpa_t and byte offset, gfn_t and page offset. Can we add a
> comment to explain the presence of BUILD_BUG_ON()?

Eh, I'd honestly rather not. I agree that on the whole, the kernel tends to be
far too stingy with comments. However, at some point a baseline of knowledge is
required. For me, understanding trunctation falls well below that line. It
does require the reader to know that guest_memfd bindings are gpa:offset, but
that too is core/basic knowledge that the reader needs to have to make any sense
of this code.

> Also, what's the rationale for picking BUILD_BUG_ON() over
> static_assert()?

https://lore.kernel.org/all/aI05DvQlMWJXewUi@xxxxxxxxxx

> static_assert() could be placed outside block scope which makes this
> relationship declaration more general/global. Putting it in the function
> makes it seem local (seems to suggest trying to assert some guard for just
> this function).

Well, yeah, because it is local. _This_ code relies on the sizes being the same,
otherwise we'd have to deal with potential truncation, whereas overall KVM does
NOT. The purpose of adding asserts like this is to document an assumption and
guard against future breakage.

Just because we _can_ put the assertion in a global location doesn't mean we
should.

KVM does use static_assert(). But without exception (at least in x86, haven't
checked other architectures), every usage is still co-located with the exact code
that relies on the assumption that's guarded by the assertion.