Re: [PATCH v11 4/4] rust: gpu: Add GPU buddy allocator bindings

From: Alexandre Courbot

Date: Wed Feb 25 2026 - 09:42:47 EST


On Wed Feb 25, 2026 at 7:40 AM JST, Joel Fernandes wrote:
> Add safe Rust abstractions over the Linux kernel's GPU buddy
> allocator for physical memory management. The GPU buddy allocator
> implements a binary buddy system useful for GPU physical memory
> allocation. nova-core will use it for physical memory allocation.
>
> Cc: Nikola Djukic <ndjukic@xxxxxxxxxx>
> Signed-off-by: Joel Fernandes <joelagnelf@xxxxxxxxxx>

<snip>
> diff --git a/rust/kernel/gpu/buddy.rs b/rust/kernel/gpu/buddy.rs
> new file mode 100644
> index 000000000000..4383f82c0fc1
> --- /dev/null
> +++ b/rust/kernel/gpu/buddy.rs
> @@ -0,0 +1,536 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! GPU buddy allocator bindings.
> +//!
> +//! C header: [`include/linux/gpu_buddy.h`](srctree/include/linux/gpu_buddy.h)
> +//!
> +//! This module provides Rust abstractions over the Linux kernel's GPU buddy
> +//! allocator, which implements a binary buddy memory allocator.
> +//!
> +//! The buddy allocator manages a contiguous address space and allocates blocks
> +//! in power-of-two sizes, useful for GPU physical memory management.
> +//!
> +//! # Examples
> +//!
> +//! ```

This is a very long example illustrating many use-cases. It is long
enough that it is difficult to grasp where each example starts. Can I
suggest aerating it a bit by splitting it into several examples, with a
bit of regular text explaining what each one does, similarly to the
documentation of the `Bounded` type?

You can hide the creation of the `GpuBuddy` in the examples after the
first one to keep things concise.
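For reference, rustdoc's hidden-line convention (lines starting with `#`
inside a doctest are compiled but not rendered) is enough for this - a
rough sketch, with the setup elided:

```
//! Allocate from the top of the address space downward:
//!
//! ```
//! # use kernel::{gpu::buddy::*, prelude::*, sizes::*};
//! # let buddy = GpuBuddy::new(/* same parameters as in the first example */)?;
//! // ...top-down allocation example body...
//! # Ok::<(), Error>(())
//! ```
```

That way each example still compiles and runs as a doctest, but the
rendered documentation only shows the part that is new.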

> +//! use kernel::{
> +//! gpu::buddy::{BuddyFlags, GpuBuddy, GpuBuddyAllocParams, GpuBuddyParams},
> +//! prelude::*,
> +//! sizes::*, //
> +//! };
> +//!
> +//! // Create a 1GB buddy allocator with 4KB minimum chunk size.
> +//! let buddy = GpuBuddy::new(GpuBuddyParams {
> +//! base_offset_bytes: 0,
> +//! physical_memory_size_bytes: SZ_1G as u64,
> +//! chunk_size_bytes: SZ_4K as u64,
> +//! })?;
> +//!
> +//! // Verify initial state.
> +//! assert_eq!(buddy.size(), SZ_1G as u64);
> +//! assert_eq!(buddy.chunk_size(), SZ_4K as u64);
> +//! let initial_free = buddy.free_memory_bytes();
> +//!
> +//! // Base allocation params - mutated between calls for field overrides.
> +//! let mut params = GpuBuddyAllocParams {
> +//! start_range_address: 0,
> +//! end_range_address: 0, // Entire range.
> +//! size_bytes: SZ_16M as u64,
> +//! min_block_size_bytes: SZ_16M as u64,
> +//! buddy_flags: BuddyFlags::try_new(BuddyFlags::RANGE_ALLOCATION)?,

Why can creating a `BuddyFlags` fail when we give it a valid value? It
looks like its consts should be of type `BuddyFlags` themselves so we
can use them directly. Actually, we should probably use `impl_flags!`
for it.
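To illustrate, here is a minimal standalone sketch (made-up bit values,
independent of `impl_flags!`) of what consts typed as `BuddyFlags` buy
us: direct use, combination with `|`, and no fallible constructor:

```rust
use core::ops::BitOr;

/// Sketch of a flags newtype whose consts are values of the type itself.
/// The bit values here are illustrative, not the real `GPU_BUDDY_*` bits.
#[derive(Clone, Copy, Default, PartialEq, Eq, Debug)]
struct BuddyFlags(u32);

impl BuddyFlags {
    const CONTIGUOUS: BuddyFlags = BuddyFlags(1 << 0);
    const TOPDOWN: BuddyFlags = BuddyFlags(1 << 1);

    /// Check whether all bits of `other` are set in `self`.
    fn contains(self, other: BuddyFlags) -> bool {
        self.0 & other.0 == other.0
    }
}

impl BitOr for BuddyFlags {
    type Output = BuddyFlags;
    fn bitor(self, rhs: BuddyFlags) -> BuddyFlags {
        BuddyFlags(self.0 | rhs.0)
    }
}

fn main() {
    // No `try_new`, no `Result`: a valid value is valid by construction.
    let flags = BuddyFlags::CONTIGUOUS | BuddyFlags::TOPDOWN;
    assert!(flags.contains(BuddyFlags::CONTIGUOUS));
    assert!(flags.contains(BuddyFlags::TOPDOWN));
    assert!(!BuddyFlags::default().contains(BuddyFlags::TOPDOWN));
}
```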

> +//! };
> +//!
> +//! // Test top-down allocation (allocates from highest addresses).
> +//! params.buddy_flags = BuddyFlags::try_new(BuddyFlags::TOPDOWN_ALLOCATION)?;
> +//! let topdown = KBox::pin_init(buddy.alloc_blocks(&params), GFP_KERNEL)?;
> +//! assert_eq!(buddy.free_memory_bytes(), initial_free - SZ_16M as u64);
> +//!
> +//! for block in topdown.iter() {
> +//! assert_eq!(block.offset(), (SZ_1G - SZ_16M) as u64);
> +//! assert_eq!(block.order(), 12); // 2^12 pages
> +//! assert_eq!(block.size(), SZ_16M as u64);
> +//! }

IIUC there should be only one block here, right? For this to be a valid
test, the for loop should be replaced by a call to `next()`, followed by
another one checking that the result is `None`.
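The pattern I mean, sketched on a plain iterator (stand-in values, not
the real block API):

```rust
fn main() {
    // Stand-in for the single-block allocation result (one 16M block).
    let blocks = [16u64 * 1024 * 1024];
    let mut iter = blocks.iter();

    // Exactly one block: take it, then verify the iterator is exhausted.
    let block = iter.next().expect("expected exactly one block");
    assert_eq!(*block, 16 * 1024 * 1024);
    assert!(iter.next().is_none(), "expected no further blocks");
}
```

Unlike the for loop, this fails if the allocation unexpectedly returns
zero blocks or more than one.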

> +//! drop(topdown);

Here is a good chance to mention that dropping the allocation returns
it to the allocator - it's expected, but not entirely obvious when you
read this for the first time.

> +//! assert_eq!(buddy.free_memory_bytes(), initial_free);
> +//!
> +//! // Allocate 16MB - should result in a single 16MB block at offset 0.
> +//! params.buddy_flags = BuddyFlags::try_new(BuddyFlags::RANGE_ALLOCATION)?;

Let's recreate the params for each example to make it self-contained
instead of modifying the first one.

> +//! let allocated = KBox::pin_init(buddy.alloc_blocks(&params), GFP_KERNEL)?;
> +//! assert_eq!(buddy.free_memory_bytes(), initial_free - SZ_16M as u64);
> +//!
> +//! for block in allocated.iter() {
> +//! assert_eq!(block.offset(), 0);
> +//! assert_eq!(block.order(), 12); // 2^12 pages
> +//! assert_eq!(block.size(), SZ_16M as u64);
> +//! }
> +//! drop(allocated);
> +//! assert_eq!(buddy.free_memory_bytes(), initial_free);
> +//!
> +//! // Test non-contiguous allocation with fragmented memory.
> +//! // Create fragmentation by allocating 4MB blocks at [0,4M) and [8M,12M).
> +//! params.end_range_address = SZ_4M as u64;
> +//! params.size_bytes = SZ_4M as u64;
> +//! params.min_block_size_bytes = SZ_4M as u64;
> +//! let frag1 = KBox::pin_init(buddy.alloc_blocks(&params), GFP_KERNEL)?;
> +//! assert_eq!(buddy.free_memory_bytes(), initial_free - SZ_4M as u64);
> +//!
> +//! params.start_range_address = SZ_8M as u64;
> +//! params.end_range_address = (SZ_8M + SZ_4M) as u64;
> +//! let frag2 = KBox::pin_init(buddy.alloc_blocks(&params), GFP_KERNEL)?;
> +//! assert_eq!(buddy.free_memory_bytes(), initial_free - SZ_8M as u64);
> +//!
> +//! // Allocate 8MB without CONTIGUOUS - should return 2 blocks from the holes.
> +//! params.start_range_address = 0;
> +//! params.end_range_address = SZ_16M as u64;
> +//! params.size_bytes = SZ_8M as u64;
> +//! let fragmented = KBox::pin_init(buddy.alloc_blocks(&params), GFP_KERNEL)?;
> +//! assert_eq!(buddy.free_memory_bytes(), initial_free - (SZ_16M) as u64);
> +//!
> +//! let (mut count, mut total) = (0u32, 0u64);
> +//! for block in fragmented.iter() {
> +//! // The 8MB allocation should return 2 blocks, each 4MB.
> +//! assert_eq!(block.size(), SZ_4M as u64);
> +//! total += block.size();
> +//! count += 1;
> +//! }
> +//! assert_eq!(total, SZ_8M as u64);
> +//! assert_eq!(count, 2);
> +//! drop(fragmented);
> +//! drop(frag2);
> +//! drop(frag1);
> +//! assert_eq!(buddy.free_memory_bytes(), initial_free);
> +//!
> +//! // Test CONTIGUOUS failure when only fragmented space available.
> +//! // Create a small buddy allocator with only 16MB of memory.
> +//! let small = GpuBuddy::new(GpuBuddyParams {
> +//! base_offset_bytes: 0,
> +//! physical_memory_size_bytes: SZ_16M as u64,
> +//! chunk_size_bytes: SZ_4K as u64,
> +//! })?;
> +//!
> +//! // Allocate 4MB blocks at [0,4M) and [8M,12M) to create fragmented memory.
> +//! params.start_range_address = 0;
> +//! params.end_range_address = SZ_4M as u64;
> +//! params.size_bytes = SZ_4M as u64;
> +//! let hole1 = KBox::pin_init(small.alloc_blocks(&params), GFP_KERNEL)?;
> +//!
> +//! params.start_range_address = SZ_8M as u64;
> +//! params.end_range_address = (SZ_8M + SZ_4M) as u64;
> +//! let hole2 = KBox::pin_init(small.alloc_blocks(&params), GFP_KERNEL)?;
> +//!
> +//! // 8MB contiguous should fail - only two non-contiguous 4MB holes exist.
> +//! params.start_range_address = 0;
> +//! params.end_range_address = 0;
> +//! params.size_bytes = SZ_8M as u64;
> +//! params.buddy_flags = BuddyFlags::try_new(BuddyFlags::CONTIGUOUS_ALLOCATION)?;
> +//! let result = KBox::pin_init(small.alloc_blocks(&params), GFP_KERNEL);
> +//! assert!(result.is_err());
> +//! drop(hole2);
> +//! drop(hole1);
> +//!
> +//! # Ok::<(), Error>(())
> +//! ```
> +
> +use crate::{
> + bindings,
> + clist_create,
> + error::to_result,
> + ffi::clist::CListHead,
> + new_mutex,
> + prelude::*,
> + sync::{
> + lock::mutex::MutexGuard,
> + Arc,
> + Mutex, //
> + },
> + types::Opaque, //
> +};
> +
> +/// Flags for GPU buddy allocator operations.
> +///
> +/// These flags control the allocation behavior of the buddy allocator.
> +#[derive(Clone, Copy, Default, PartialEq, Eq)]
> +pub struct BuddyFlags(usize);
> +
> +impl BuddyFlags {
> + /// Range-based allocation from start to end addresses.
> + pub const RANGE_ALLOCATION: usize = bindings::GPU_BUDDY_RANGE_ALLOCATION;
> +
> + /// Allocate from top of address space downward.
> + pub const TOPDOWN_ALLOCATION: usize = bindings::GPU_BUDDY_TOPDOWN_ALLOCATION;
> +
> + /// Allocate physically contiguous blocks.
> + pub const CONTIGUOUS_ALLOCATION: usize = bindings::GPU_BUDDY_CONTIGUOUS_ALLOCATION;
> +
> + /// Request allocation from the cleared (zeroed) memory. The zero'ing is not
> + /// done by the allocator, but by the caller before freeing old blocks.
> + pub const CLEAR_ALLOCATION: usize = bindings::GPU_BUDDY_CLEAR_ALLOCATION;
> +
> + /// Disable trimming of partially used blocks.
> + pub const TRIM_DISABLE: usize = bindings::GPU_BUDDY_TRIM_DISABLE;
> +
> + /// Mark blocks as cleared (zeroed) when freeing. When set during free,
> + /// indicates that the caller has already zeroed the memory.
> + pub const CLEARED: usize = bindings::GPU_BUDDY_CLEARED;
> +
> + /// Create [`BuddyFlags`] from a raw value with validation.
> + ///
> + /// Use `|` operator to combine flags if needed, before calling this method.
> + pub fn try_new(flags: usize) -> Result<Self> {
> + // Flags must not exceed u32::MAX to satisfy the GPU buddy allocator C API.
> + if flags > u32::MAX as usize {

These `as` conversions are unfortunate - I will try to graduate the
infallible converters from Nova into the kernel soon so we can avoid
them, but for now I guess there is nothing we can do.

> + return Err(EINVAL);
> + }
> +
> + // `TOPDOWN_ALLOCATION` only works without `RANGE_ALLOCATION`. When both are
> + // set, `TOPDOWN_ALLOCATION` is silently ignored by the allocator. Reject this.
> + if (flags & Self::RANGE_ALLOCATION) != 0 && (flags & Self::TOPDOWN_ALLOCATION) != 0 {
> + return Err(EINVAL);
> + }

This indicates that we should use the type system to prevent such
constructs from even being attempted - more on this under
`GpuBuddyAllocParams`.

> +
> + Ok(Self(flags))
> + }
> +
> + /// Get raw value of the flags.
> + pub(crate) fn as_raw(self) -> usize {
> + self.0
> + }
> +}
> +
> +/// Parameters for creating a GPU buddy allocator.
> +pub struct GpuBuddyParams {
> + /// Base offset in bytes where the managed memory region starts.
> + /// Allocations will be offset by this value.
> + pub base_offset_bytes: u64,

Let's remove the `_bytes` suffix. Units can be specified in the
doccomment so they are readily available without making the code
heavier (`dma.rs` does this, for instance).

> + /// Total physical memory size managed by the allocator in bytes.
> + pub physical_memory_size_bytes: u64,
> + /// Minimum allocation unit / chunk size in bytes, must be >= 4KB.
> + pub chunk_size_bytes: u64,
> +}

This structure doesn't seem to be useful. I would understand using one
if `GpuBuddyParams` had lots of members, some of which have a sensible
default value - then we could implement `Default` and let users fill in
the parameters they need.

But this structure has no constructor of any sort, requiring users to
fill its 3 members manually - which is actually heavier than passing 3
parameters to `GpuBuddy::new`. It is even deconstructed in
`GpuBuddyInner`, which stores its members as 3 different fields! So
let's skip it.
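Roughly what I mean, with a stand-in type and illustrative validation
only (the real `new` would of course build the pinned inner type):

```rust
// Stand-in for the real wrapper around the C allocator.
struct GpuBuddy {
    base_offset: u64,
    size: u64,
    chunk_size: u64,
}

impl GpuBuddy {
    /// All values in bytes; `chunk_size` must be at least 4 KiB.
    fn new(base_offset: u64, size: u64, chunk_size: u64) -> Result<Self, &'static str> {
        if chunk_size < 4096 {
            return Err("chunk_size must be >= 4 KiB");
        }
        Ok(Self { base_offset, size, chunk_size })
    }
}

fn main() {
    // Three plain arguments instead of a single-use parameter struct.
    let buddy = GpuBuddy::new(0, 1 << 30, 4096).unwrap();
    assert_eq!(buddy.size, 1 << 30);
    assert!(GpuBuddy::new(0, 1 << 30, 1024).is_err());
}
```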

> +
> +/// Parameters for allocating blocks from a GPU buddy allocator.
> +pub struct GpuBuddyAllocParams {
> + /// Start of allocation range in bytes. Use 0 for beginning.
> + pub start_range_address: u64,
> + /// End of allocation range in bytes. Use 0 for entire range.
> + pub end_range_address: u64,
> + /// Total size to allocate in bytes.
> + pub size_bytes: u64,
> + /// Minimum block size for fragmented allocations in bytes.
> + pub min_block_size_bytes: u64,
> + /// Buddy allocator behavior flags.
> + pub buddy_flags: BuddyFlags,
> +}

This one also feels like it could be rustified some more.

By this I mean that it e.g. allows the user to specify a range even if
`RANGE_ALLOCATION` is not set. A C API rejects invalid combinations at
runtime. A Rust API should make it impossible to even express them.

IIUC the flags mix the allocation type (simple, range, topdown) with
some orthogonal properties (contiguous, cleared, trim_disable). There is
also one bit (`GPU_BUDDY_CLEARED`) that is not relevant for allocating,
but for freeing a block. We want to use the type system so that only
states that make sense can be constructed.

The allocation type can be expressed using a three-state enum. Start and
end range only make sense for range allocations, so they would be part
of the `Range` variant.

`size` and `min_block_size` (let's move the unit from the name into the
doccomment) are always relevant, and should be regular arguments of
their own. If `min_block_size` only accepts certain values (the C
documentation says "alignment", so power of two?) then it should also
use the relevant type for that (our own `kernel::ptr::Alignment` type?).

`flags` should be its own type (using `impl_flags` again?) allowing a
combination of Contiguous, Cleared, and TrimDisable.

That would turn `alloc_blocks` into something like:

`fn alloc_blocks(&self, alloc: AllocType, size: u64, min_block_size: Alignment, flags: AllocBlocksFlags)`

(`min_block_size` is a u64 in the C API, but for an alignment a `usize`
is enough and the conversion can be lossless).

> +
> +/// Inner structure holding the actual buddy allocator.
> +///
> +/// # Synchronization
> +///
> +/// The C `gpu_buddy` API requires synchronization (see `include/linux/gpu_buddy.h`).
> +/// The internal [`GpuBuddyGuard`] ensures that the lock is held for all

`GpuBuddyGuard` is exported and not internal though.

> +/// allocator and free operations, preventing races between concurrent allocations
> +/// and the freeing that occurs when [`AllocatedBlocks`] is dropped.
> +///
> +/// # Invariants
> +///
> +/// The inner [`Opaque`] contains a valid, initialized buddy allocator.
> +#[pin_data(PinnedDrop)]
> +struct GpuBuddyInner {
> + #[pin]
> + inner: Opaque<bindings::gpu_buddy>,
> +
> + // TODO: Replace `Mutex<()>` with `Mutex<Opaque<..>>` once `Mutex::new()`
> + // accepts `impl PinInit<T>`.
> + #[pin]
> + lock: Mutex<()>,
> + /// Base offset for all allocations (does not change after init).
> + base_offset: u64,

This does not appear to be used in the C API - does it belong here? It
looks like an additional convenience, but I'm not convinced it is the
role of this type to provide it. But if it really is needed by all
users (guess I'll find out after looking at the Nova code :)), then
keeping it is fair I guess.

> + /// Cached chunk size (does not change after init).
> + chunk_size: u64,
> + /// Cached total size (does not change after init).
> + size: u64,
> +}
> +
> +impl GpuBuddyInner {
> + /// Create a pin-initializer for the buddy allocator.
> + fn new(params: GpuBuddyParams) -> impl PinInit<Self, Error> {
> + let base_offset = params.base_offset_bytes;
> + let size = params.physical_memory_size_bytes;
> + let chunk_size = params.chunk_size_bytes;
> +
> + try_pin_init!(Self {
> + inner <- Opaque::try_ffi_init(|ptr| {
> + // SAFETY: ptr points to valid uninitialized memory from the pin-init
> + // infrastructure. gpu_buddy_init will initialize the structure.
> + to_result(unsafe { bindings::gpu_buddy_init(ptr, size, chunk_size) })
> + }),
> + lock <- new_mutex!(()),
> + base_offset: base_offset,
> + chunk_size: chunk_size,
> + size: size,
> + })
> + }
> +
> + /// Lock the mutex and return a guard for accessing the allocator.
> + fn lock(&self) -> GpuBuddyGuard<'_> {
> + GpuBuddyGuard {
> + inner: self,
> + _guard: self.lock.lock(),
> + }
> + }
> +}
> +
> +#[pinned_drop]
> +impl PinnedDrop for GpuBuddyInner {
> + fn drop(self: Pin<&mut Self>) {
> + let guard = self.lock();
> +
> + // SAFETY: guard provides exclusive access to the allocator.
> + unsafe {
> + bindings::gpu_buddy_fini(guard.as_raw());
> + }
> + }
> +}
> +
> +// SAFETY: GpuBuddyInner can be sent between threads.
> +unsafe impl Send for GpuBuddyInner {}
> +
> +// SAFETY: GpuBuddyInner is `Sync` because the internal GpuBuddyGuard
> +// serializes all access to the C allocator, preventing data races.
> +unsafe impl Sync for GpuBuddyInner {}
> +
> +/// Guard that proves the lock is held, enabling access to the allocator.
> +///
> +/// # Invariants
> +///
> +/// The inner `_guard` holds the lock for the duration of this guard's lifetime.

Private members should not be mentioned in public documentation. Also,
is this invariant ever referenced to justify an unsafe block? If not, I
don't think there is a point in having it.

> +pub(crate) struct GpuBuddyGuard<'a> {

IIUC this type can be private.

> + inner: &'a GpuBuddyInner,
> + _guard: MutexGuard<'a, ()>,
> +}
> +
> +impl GpuBuddyGuard<'_> {
> + /// Get a raw pointer to the underlying C `gpu_buddy` structure.
> + fn as_raw(&self) -> *mut bindings::gpu_buddy {
> + self.inner.inner.get()
> + }
> +}
> +
> +/// GPU buddy allocator instance.
> +///
> +/// This structure wraps the C `gpu_buddy` allocator using reference counting.
> +/// The allocator is automatically cleaned up when all references are dropped.
> +///
> +/// # Invariants
> +///
> +/// The inner [`Arc`] points to a valid, initialized GPU buddy allocator.

Most people looking for the documentation will reach it through
`GpuBuddy`. I think we should either move the module-level documentation
to this type, or add a reference to the module so users can easily find
how to use it.

> +pub struct GpuBuddy(Arc<GpuBuddyInner>);
> +
> +impl GpuBuddy {
> + /// Create a new buddy allocator.
> + ///
> + /// Creates a buddy allocator that manages a contiguous address space of the given
> + /// size, with the specified minimum allocation unit (chunk_size must be at least 4KB).
> + pub fn new(params: GpuBuddyParams) -> Result<Self> {
> + Ok(Self(Arc::pin_init(GpuBuddyInner::new(params), GFP_KERNEL)?))
> + }
> +
> + /// Get the base offset for allocations.
> + pub fn base_offset(&self) -> u64 {
> + self.0.base_offset
> + }
> +
> + /// Get the chunk size (minimum allocation unit).
> + pub fn chunk_size(&self) -> u64 {
> + self.0.chunk_size
> + }
> +
> + /// Get the total managed size.
> + pub fn size(&self) -> u64 {
> + self.0.size
> + }
> +
> + /// Get the available (free) memory in bytes.
> + pub fn free_memory_bytes(&self) -> u64 {

As with the struct members, the unit doesn't need to be in the method
name - the doccomment is sufficient.

> + let guard = self.0.lock();
> +
> + // SAFETY: guard provides exclusive access to the allocator.
> + unsafe { (*guard.as_raw()).avail }
> + }
> +
> + /// Allocate blocks from the buddy allocator.
> + ///
> + /// Returns a pin-initializer for [`AllocatedBlocks`].
> + ///
> + /// Takes `&self` instead of `&mut self` because the internal [`Mutex`] provides
> + /// synchronization - no external `&mut` exclusivity needed.
> + pub fn alloc_blocks(
> + &self,
> + params: &GpuBuddyAllocParams,
> + ) -> impl PinInit<AllocatedBlocks, Error> {
> + let buddy_arc = Arc::clone(&self.0);
> + let start = params.start_range_address;
> + let end = params.end_range_address;
> + let size = params.size_bytes;
> + let min_block_size = params.min_block_size_bytes;
> + let flags = params.buddy_flags;

These local variables are required so the closure below is not confused
by the lifetime of `params`. But since you are copying its content
anyway, you could just make `GpuBuddyAllocParams` derive `Copy`, pass
`params` by value, and use its members directly in the closure.

That probably won't be needed if we split `GpuBuddyAllocParams` as I
suggested though.
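The `Copy` approach in miniature (plain types, with a `move` closure
standing in for the pin-initializer):

```rust
/// Stand-in for `GpuBuddyAllocParams`; deriving `Copy` is what matters.
#[derive(Clone, Copy)]
struct AllocParams {
    size: u64,
    min_block_size: u64,
}

// Taking `params` by value plus `Copy` means the closure owns its own
// copy, so no borrow of the caller's value has to outlive the call and
// no intermediate local variables are needed.
fn alloc_blocks(params: AllocParams) -> impl Fn() -> u64 {
    move || params.size / params.min_block_size
}

fn main() {
    let params = AllocParams { size: 8 << 20, min_block_size: 4 << 20 };
    let init = alloc_blocks(params);
    // `params` is still usable here because it was copied, not moved.
    assert_eq!(params.size, 8 << 20);
    assert_eq!(init(), 2);
}
```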

> +
> + // Create pin-initializer that initializes list and allocates blocks.
> + try_pin_init!(AllocatedBlocks {
> + buddy: buddy_arc,
> + list <- CListHead::new(),
> + flags: flags,
> + _: {
> + // Lock while allocating to serialize with concurrent frees.
> + let guard = buddy.lock();
> +
> + // SAFETY: `guard` provides exclusive access to the buddy allocator.
> + to_result(unsafe {
> + bindings::gpu_buddy_alloc_blocks(
> + guard.as_raw(),
> + start,
> + end,
> + size,
> + min_block_size,
> + list.as_raw(),
> + flags.as_raw(),
> + )
> + })?
> + }
> + })
> + }
> +}
> +
> +/// Allocated blocks from the buddy allocator with automatic cleanup.
> +///
> +/// This structure owns a list of allocated blocks and ensures they are
> +/// automatically freed when dropped. Use `iter()` to iterate over all
> +/// allocated [`Block`] structures.
> +///
> +/// # Invariants
> +///
> +/// - `list` is an initialized, valid list head containing allocated blocks.
> +#[pin_data(PinnedDrop)]
> +pub struct AllocatedBlocks {
> + #[pin]
> + list: CListHead,
> + buddy: Arc<GpuBuddyInner>,
> + flags: BuddyFlags,
> +}
> +
> +impl AllocatedBlocks {
> + /// Check if the block list is empty.
> + pub fn is_empty(&self) -> bool {
> + // An empty list head points to itself.
> + !self.list.is_linked()
> + }
> +
> + /// Iterate over allocated blocks.
> + ///
> + /// Returns an iterator yielding [`AllocatedBlock`] values. Each [`AllocatedBlock`]
> + /// borrows `self` and is only valid for the duration of that borrow.
> + pub fn iter(&self) -> impl Iterator<Item = AllocatedBlock<'_>> + '_ {
> + // SAFETY: list contains gpu_buddy_block items linked via __bindgen_anon_1.link.

IIUC the type invariant should be invoked explicitly as we are using it
to justify this unsafe block (i.e. "per the type invariant, ...").

> + let clist = clist_create!(unsafe {
> + self.list.as_raw(),
> + Block,
> + bindings::gpu_buddy_block,
> + __bindgen_anon_1.link
> + });
> +
> + clist
> + .iter()
> + .map(|block| AllocatedBlock { block, alloc: self })
> + }
> +}
> +
> +#[pinned_drop]
> +impl PinnedDrop for AllocatedBlocks {
> + fn drop(self: Pin<&mut Self>) {
> + let guard = self.buddy.lock();
> +
> + // SAFETY:
> + // - list is valid per the type's invariants.
> + // - guard provides exclusive access to the allocator.
> + // CAST: BuddyFlags were validated to fit in u32 at construction.
> + unsafe {
> + bindings::gpu_buddy_free_list(
> + guard.as_raw(),
> + self.list.as_raw(),
> + self.flags.as_raw() as u32,

You won't need this `as` if you make `BuddyFlags` wrap a `u32` instead
of a `usize`.

> + );
> + }
> + }
> +}
> +
> +/// A GPU buddy block.
> +///
> +/// Transparent wrapper over C `gpu_buddy_block` structure. This type is returned
> +/// as references during iteration over [`AllocatedBlocks`].
> +///
> +/// # Invariants
> +///
> +/// The inner [`Opaque`] contains a valid, allocated `gpu_buddy_block`.
> +#[repr(transparent)]
> +pub struct Block(Opaque<bindings::gpu_buddy_block>);
> +
> +impl Block {
> + /// Get a raw pointer to the underlying C block.
> + fn as_raw(&self) -> *mut bindings::gpu_buddy_block {
> + self.0.get()
> + }
> +
> + /// Get the block's offset in the address space.
> + pub(crate) fn offset(&self) -> u64 {
> + // SAFETY: self.as_raw() is valid per the type's invariants.
> + unsafe { bindings::gpu_buddy_block_offset(self.as_raw()) }
> + }
> +
> + /// Get the block order.
> + pub(crate) fn order(&self) -> u32 {
> + // SAFETY: self.as_raw() is valid per the type's invariants.
> + unsafe { bindings::gpu_buddy_block_order(self.as_raw()) }
> + }
> +}
> +
> +// SAFETY: `Block` is not modified after allocation for the lifetime
> +// of `AllocatedBlock`.

This safety comment should not need to reference another type - how
about something like "`Block` is a wrapper around `gpu_buddy_block`,
which can be sent across threads safely"?

> +unsafe impl Send for Block {}
> +
> +// SAFETY: `Block` is not modified after allocation for the lifetime
> +// of `AllocatedBlock`.

Here as well. I'd say that the block is only accessed through shared
references after allocation, and is thus safe to access concurrently
across threads.