Re: [PATCH v11 4/4] rust: gpu: Add GPU buddy allocator bindings

From: Joel Fernandes

Date: Wed Feb 25 2026 - 15:41:51 EST


Hi Alex,

On Wed, Feb 25, 2026 at 11:38:31PM +0900, Alexandre Courbot wrote:
>> +//! # Examples
>> +//!
>> +//! ``` [..]
>
> This is a very long example illustrating many use-cases. It is long
> enough that it is difficult to grasp where each example start. Can I
> suggest to aerate it a bit by splitting it into several examples, with a
> bit of regular text explaining what each example does, similarly to the
> documentation of the `Bounded` type?
>
> You can hide the creation of the `GpuBuddy` after the first example to
> keep things concise.

Done. Split into four separate examples with descriptive text between
them. Subsequent examples hide imports and buddy creation with # lines,
and adjust based on your other suggestions.

>> +//! buddy_flags: BuddyFlags::try_new(BuddyFlags::RANGE_ALLOCATION)?,
>
> Why can a `BuddyFlags` creation fail if we give it a valid value? It
> looks like its consts should be of the type `BuddyFlags` themselves so
> we can use them directly. Actually, we should probably use `impl_flags!`
> for it.

Good point. Switched to `impl_flags!`. I made it wrap `u32`
and individual flags are `BuddyFlag` enum variants. Construction is
infallible. Invalid combinations will be caught by the C allocator
at alloc time.

>> +//! for block in topdown.iter() {
>> +//! assert_eq!(block.offset(), (SZ_1G - SZ_16M) as u64);
>> +//! assert_eq!(block.order(), 12); // 2^12 pages
>> +//! assert_eq!(block.size(), SZ_16M as u64);
>> +//! }
>
> IIUC there should be only one block here, right? That for loop should be
> replaced by a call to `next()` followed by another one checking that the
> result is `None` to be a valid test.

Ah, good point, fixed as follows:
let block = topdown.iter().next().expect("expected one block");
assert_eq!(block.offset(), (SZ_1G - SZ_16M) as u64);
assert_eq!(block.order(), 12);
assert_eq!(block.size(), SZ_16M as u64);

>> +//! drop(topdown);
>
> Here is a good chance to mention that dropping the allocation will
> return it - it's expected, but not entirely obvious when you read this
> for the first time.

Added a comment: "Dropping the allocation returns the memory to the
buddy allocator."

>> +//! params.buddy_flags = BuddyFlags::try_new(BuddyFlags::RANGE_ALLOCATION)?;
>
> Let's recreate the params for each example to make it self-contained
> instead of modifying the first one.

Done, each example now creates its own self-contained params.

>> + if flags > u32::MAX as usize {
>
> These `as` conversions are unfortunate - I will try to graduate the
> infallible convertors from Nova into kernel soon so we can avoid them,
> but for now I guess there's nothing we can do unfortunately.

Since I switched to `impl_flags!` with `u32`, the `u32::MAX` check
is gone.

>> + if (flags & Self::RANGE_ALLOCATION) != 0 && (flags & Self::TOPDOWN_ALLOCATION) != 0 {
>> + return Err(EINVAL);
>> + }
>
> This indicates that we should use the type system to prevent such
> constructs from even being attempted - more on this on
> `GpuBuddyAllocParams`.

The C API supports flag combinations (e.g. RANGE+CLEAR, TOPDOWN+CLEAR),
so we model flags as combinable bitflags via `impl_flags!` as you suggested,
rather than a mutually-exclusive enum. Invalid combinations are caught by the C
allocator at runtime. Also fixed a bug here: RANGE+TOPDOWN is valid in C
(TOPDOWN is just unused in the range path), so the old rejection was wrong.
Removed it.

>> + pub base_offset_bytes: u64,
>
> Let's remove the `_bytes` suffix. Units can be specified in the
> doccomment so they are readily available without making the code
> heavier (`dma.rs` for instance does this).

Done.

>> +pub struct GpuBuddyParams {
>
> This structure doesn't seem to be useful. I would understand using one
> if `GpuBuddyParams` had lots of members, some of which have a sensible
> default value - then we could implement `Default` and let users fill in
> the parameters they need.
>
> But this structure has no constructor of any sort, requiring users to
> fill its 3 members manually - which is actually heavier than having 3
> parameters to `GpuBuddy::new`. It is even deconstructed in
> `GpuBuddyInner` to store its members as 3 different fields! So let's
> skip it.

I'd prefer to keep the struct -- all three parameters are `u64`, so
positional arguments would be easy to silently misorder. The struct
also makes call sites more readable since Rust has no named function call
parameters.

>> +pub struct GpuBuddyAllocParams {
>
> This one also feels like it could be rustified some more.
>
> By this I mean that it e.g. allows the user to specify a range even if
> `RANGE_ALLOCATION` is not set. A C API rejects invalid combinations at
> runtime. A Rust API should make it impossible to even express them.
>
> [...]
>
> That would turn `alloc_blocks` into something like:
>
> `fn alloc_blocks(&self, alloc: AllocType, size: u64, min_block_size: Alignment, flags: AllocBlocksFlags)`

The C API supports combining allocation modes with modifiers (e.g.
RANGE+CLEAR, TOPDOWN+CLEAR), so modeling the mode as a
mutually-exclusive enum would lose valid combinations. More importantly,
if the C allocator evolves its flag semantics (new combinations become
valid, or existing constraints change), an enum on the Rust side would
break. It's simpler and more maintainable to pass combinable flags and
let the C allocator validate -- which it already does. The switch to
`impl_flags!` will work for us without over-constraining.

>> +/// The internal [`GpuBuddyGuard`] ensures that the lock is held for all
>
> `GpuBuddyGuard` is exported and not internal though.

Fixed, removed "internal" wording.

>> + base_offset: u64,
>
> This does not appear to be used in the C API - does it belong here? It
> looks like an additional convenience, but I'm not convinced that's the
> role of this type to provide this. But if it really is needed by all
> users (guess I'll find out after looking the Nova code :)), then keeping
> it is fair I guess.

Yes, `base_offset` is needed by nova-core. The GPU's usable VRAM
starts at `usable_vram_start` from the GSP firmware parameters:

GpuBuddyParams {
base_offset: params.usable_vram_start,
physical_memory_size: params.usable_vram_size,
chunk_size: SZ_4K.into_safe_cast(),
}

`AllocatedBlock::offset()` then adds `base_offset` to return absolute
VRAM addresses, so callers don't need to track the offset themselves.

>> +/// # Invariants
>> +///
>> +/// The inner `_guard` holds the lock for the duration of this guard's lifetime.
>
> Private members should not be mentioned in public documentation. Also is
> this invariant ever referenced when enforced and to justify an unsafe
> block? If not I don't think there is a point in having it.
>>
>> +pub(crate) struct GpuBuddyGuard<'a> {
>
> IIUC this type can be private.

Done, made `GpuBuddyGuard` private and converted to a regular `//` comment.

>> + pub fn free_memory_bytes(&self) -> u64 {
>
> Same as struct members, the unit doesn't need to be in the method name -
> the doccomment is sufficient.

Renamed.

>> +pub struct GpuBuddy(Arc<GpuBuddyInner>);
>
> Most people looking for the documentation will reach it through
> `GpuBuddy`. I think we should either move the module-level documentation
> to this type, or add a reference to the module so users can easily find
> how to use it.

Ok, I refer to the module-level doc on the struct.

>> + let start = params.start_range_address;
>> + let end = params.end_range_address;
>> + let size = params.size_bytes;
>> + let min_block_size = params.min_block_size_bytes;
>> + let flags = params.buddy_flags;
>
> These local variables are required so the closure below is not confused
> by the lifetime of `params`. But since you are copying its content
> anyway, you could just make `GpuBuddyAllocParams` derive `Copy`, pass
> `params` by value, and use its members directly in the closure.

What I will do is just pass it by value (instead of the reference that I'm
currently passing), and then let the compiler decide if it needs to make copies
or not. In the future, if we change it to not making copies inside the function,
then it will just fallback to a non-copy move. However, if implemented as copy
trait, it might always be copied.

Actually, it turns out that when I pass it by value, I can also get rid of the
intermediate variables so this is great and has the effect you were intending!

>> + // SAFETY: list contains gpu_buddy_block items linked via __bindgen_anon_1.link.
>
> IIUC the type invariant should be invoked explicitly as we are using it
> to justify this unsafe block (i.e. "per the type invariant, ...").

Fixed.

>> + self.flags.as_raw() as u32,
>
> You won't need this `as` if you make `BuddyFlags` wrap a `u32` instead
> of a `usize`.

Yes! Done.

>> +// SAFETY: `Block` is not modified after allocation for the lifetime
>> +// of `AllocatedBlock`.
>> +unsafe impl Send for Block {}
>
> This safety comment should not need to reference another type - how
> about something like "`Block` is a wrapper around `gpu_buddy_block`
> which can be sent across threads safely".

Fixed.

>> +// SAFETY: `Block` is not modified after allocation for the lifetime
>> +// of `AllocatedBlock`.
>> +unsafe impl Sync for Block {}
>
> Here as well. I'd say that the block is only accessed through shared
> references after allocation, and thus safe to access concurrently across
> threads.

Fixed.

Thanks for the thorough review! Looks a lot better now, and I am looking forward
to sending the next revision soon.

--
Joel Fernandes