Re: [PATCH v11 00/22] gpu: nova-core: firmware: Hopper/Blackwell support

From: John Hubbard

Date: Fri May 29 2026 - 23:30:36 EST


Shoot, I seem to have used the older, now-wrong script to send this, because
I've only Cc'd rust-for-linux, and left out our nice new nova-gpu@xxxxxxxxxxxxxxx
mailing list.

I'll +Cc nova-gpu@xxxxxxxxxxxxxxx just on this cover letter, for now. I hope
this didn't mess up anyone's inbox too badly.

thanks,
John Hubbard

On 5/29/26 8:09 PM, John Hubbard wrote:
> Changes in v11:
>
> * Made the FSP messaging path sound. The FSP falcon's EMEM window is a
> stateful register pair (program an offset, then touch the data
> register), so modeling it as a stateless I/O region let aliasing
> accesses corrupt each other's offset with no unsafe at the call site.
> The EMEM accessors and the send/receive helpers now take &mut self, so
> the falcon handle is the exclusive token for an in-flight exchange,
> and the unsafe Io/IoCapable impls and their unreachable! bounds checks
> are gone. The accessors now program the EMEM offset once and stream
> through the data register using the falcon's auto-increment, matching
> Open RM, instead of re-programming the offset for every word.
>
> * Rebased onto a current drm-rust-next that already carries the v10
> preparatory patches, which are dropped from the series.
>
> * Top of the series: the v10 boot-integration patch is replaced by "gsp:
> enable FSP boot path" (Alexandre Courbot) and "add non-sec2 unload
> path" (Eliot Courtney). The Hopper/Blackwell boot path now lives in
> the GSP HAL (gsp/hal/gh100.rs) and returns a BootUnloadGuard.
>
> * Reordered per review: hardware-differences patches first (DMA mask,
> PCI config mirror, PMU-reserved framebuffer, non-WPR heap, WPR2 heap,
> sysmem flush registers), then the FSP/FMC stack, then GSP lockdown
> release polling.
>
> * Hardware-difference patches are now HAL methods instead of inline
> Architecture matches: the PMU-reserved framebuffer size (patch
> retitled from "calculate reserved FB heap size" to "compute
> PMU-reserved framebuffer size"), the non-WPR heap size (now u32 with a
> 1 MiB default instead of Option<u32>, per v10 review, with the GB10x
> value in the GB100 HAL and the larger GB20x value in the GB202 HAL),
> and the PCI config mirror range. The larger WPR2 heap pulls its base
> size from the generated bindings, drops the custom constants that have
> no Open RM counterpart, and matches all architectures exhaustively.
>
> * FSP firmware handling moved into firmware/fsp.rs: FspFirmware now
> holds parsed signatures (KBox<FmcSignatures>) instead of a raw ELF
> copy, extracted through a get_section closure (per v10 review).
>
> * FSP secure-boot polling uses a per-chipset FSP HAL
> (fsp/hal/{gh100,gb202}.rs) reading the correct NV_THERM_I2CS register,
> instead of a free function in regs.rs.
>
> * FSP Chain of Trust boot was redone around a new FmcBootArgs type, and
> the response headers are strongly typed (MctpHeader/NvdmHeader instead
> of bare u32), with the vendor ID from kernel::pci::Vendor.
>
> * GB10x/GB20x sysmem flush: the HSHUB0/FBHUB0 register details moved
> from module doccomments onto the write_sysmem_flush_page_* methods.
>
> * Commit message cleanups: dropped stale claims, shortened an
> over-length subject, and fixed trailer ordering.
>
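
The aliasing problem in the first v11 bullet can be illustrated with a
small stand-alone model. This is only a sketch, not code from the series:
EmemWindow, EmemError, and every method name below are hypothetical, and a
Vec stands in for the hardware. It assumes each data-register access
advances the offset by one word. Because every accessor takes &mut self,
the borrow checker rejects a second exchange that would reprogram the
offset while the first one is still streaming.

```rust
/// Hypothetical error type for the model below.
#[derive(Debug, PartialEq, Eq)]
enum EmemError {
    OutOfRange,
}

/// Toy model of a stateful window: an offset register plus a data
/// register whose accesses auto-increment the offset (assumption).
struct EmemWindow {
    mem: Vec<u32>,
    offset: usize, // word index last programmed into the offset register
}

impl EmemWindow {
    fn new(words: usize) -> Self {
        Self { mem: vec![0; words], offset: 0 }
    }

    /// Program the offset register.
    fn set_offset(&mut self, word: usize) {
        self.offset = word;
    }

    /// Write one word through the data register, then auto-increment.
    fn write_data(&mut self, value: u32) -> Result<(), EmemError> {
        let slot = self.mem.get_mut(self.offset).ok_or(EmemError::OutOfRange)?;
        *slot = value;
        self.offset += 1;
        Ok(())
    }

    /// Read one word through the data register, then auto-increment.
    fn read_data(&mut self) -> Result<u32, EmemError> {
        let value = *self.mem.get(self.offset).ok_or(EmemError::OutOfRange)?;
        self.offset += 1;
        Ok(value)
    }

    /// Program the offset once, then stream every word.
    fn write_words(&mut self, start: usize, data: &[u32]) -> Result<(), EmemError> {
        self.set_offset(start);
        for &word in data {
            self.write_data(word)?;
        }
        Ok(())
    }

    /// Program the offset once, then stream every word back.
    fn read_words(&mut self, start: usize, out: &mut [u32]) -> Result<(), EmemError> {
        self.set_offset(start);
        for word in out.iter_mut() {
            *word = self.read_data()?;
        }
        Ok(())
    }
}
```

With &self accessors, two callers could interleave set_offset() and
write_data() and silently land words at the wrong offset; with &mut self
that interleaving does not compile.
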
> Changes in v10:
>
> * Reordered per review (and direct assistance--thanks again) from
> Alexandre Courbot: the two refactoring patches (factor .fwsignature*
> selection, use GPU Architecture to simplify HALs) now come first,
> before GPU identification. The boot_via_fsp stub is introduced early
> and completed as FSP features arrive. The SEC2 refactoring, PCI config
> mirror, and reserved heap size patches are moved earlier in the
> series.
>
> * Made pmuReservedSize conditional on Blackwell dGPU architectures.
> Open RM only sets this field for Blackwell (Turing/Ampere/Ada/Hopper
> all leave it zero). Added calc_pmu_reserved_size() helper and
> FbLayout.pmu_reserved_size field to route the value through the
> layout instead of using the constant unconditionally. Replaced
> `as u32` cast with usize_into_u32 for PMU_RESERVED_SIZE. (Alexandre)
>
> * Split the GFW boot wait HAL change into two patches: one that moves
> the existing behavior into a GpuHal trait, and a second that adds the
> Hopper/Blackwell skip.
>
> * Removed the Spec::chipset() accessor (no longer needed after
> restructuring). Updated the Copy/Clone commit message accordingly.
>
> * Rebased onto drm-rust-next-staging, which includes
> const_align_up(), "move firmware image parsing code to firmware.rs",
> "factor out an elf_str() function", and "make WPR heap sizing
> fallible" from the v9 series. Series is now 28 patches (was 31).
>
> * Depends on the "rust: sizes: SizeConstants trait" series[N], which
> adds typed SZ_* constants (u64::SZ_1M, u32::SZ_4K, etc.). The
> nova-core conversion patch ("use SizeConstants trait for u64 size
> constants") will be posted separately, but is already included in my
> git branch. The Blackwell patches that introduce new SZ_* usage
> (larger non-WPR heap, FSP Chain of Trust boot, larger WPR2 heap) use
> the trait form from the start.
>
> * Fixed the PCI config mirror commit message: corrected hex offsets to
> match the code (older architectures use 0x088000, Hopper/Blackwell
> use 0x092000).
>
> * Dropped the never-used nvdm_type_raw() method from the MCTP/NVDM
> introducing patch.
>
> * Removed stale Co-developed-by tag from the FSP Chain of Trust boot
> commit per Alex's request. Rewrote the commit message to remove
> references to the no-longer-existent fmc_full field.
>
> * Added missing #[expect(dead_code)] on GspFmcBootParams in the FSP
> secure boot commit, removed when the struct becomes used in the
> Chain of Trust boot commit.
>
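
For context on the `as u32` replacement mentioned in the v10 notes: an
`as` cast silently truncates a value that does not fit, whereas a checked
conversion reports the problem. The sketch below uses the standard
library's TryFrom as a stand-in. It is not the usize_into_u32 helper from
the series, whose definition is not shown in this message, and
size_to_u32 is a hypothetical name.

```rust
use std::convert::TryFrom;

/// Convert a size to u32, returning None instead of truncating.
/// Hypothetical helper, shown only to contrast with an `as` cast.
fn size_to_u32(size: u64) -> Option<u32> {
    u32::try_from(size).ok()
}

/// What an `as` cast does with the same input: keep the low 32 bits.
fn truncating_cast(size: u64) -> u32 {
    size as u32
}
```
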
> Changes in v9:
>
> * Rebased onto today's drm-rust-next.
>
> * Split Architecture::Blackwell into BlackwellGB10x and BlackwellGB20x,
> after Gary Guo and Sashiko pointed out that GB10x and GB20x are
> distinct enough to warrant separate architecture variants. This
> surfaced several bugs where all Blackwell chips were incorrectly
> treated as a single group:
> * Fixed the FSP boot completion register address for GB10x. GB10x
> uses the same address as Hopper (0x000200bc), not the GB20x
> address (0x00ad00bc).
> * Made the FSP secure boot timeout architecture-dependent. GB20x
> now gets 5000ms while Hopper and GB10x keep 4000ms.
> * Removed chipset-level match arms that were working around the
> single-variant design in fb/hal.rs, firmware/gsp.rs, and regs.rs.
>
> * Simplified find_gsp_sigs_section() to return &'static str instead of
> Option<&'static str>, since the Architecture enum is now exhaustive
> and every variant has a known signature section name.
>
> * Moved dma_set_mask_and_coherent from probe() into Gpu::new(), with
> the unsafe block narrowed to just that call. Gpu::new() now takes
> pci::Device<device::Core> instead of device::Bound to support this.
>
> * Dropped the local `chipset` variable in Gpu::new() and accessed
> spec.chipset() directly, since Spec is now Copy.
>
> * Changed Spec::chipset() to take self instead of &self, since Spec is
> Copy.
>
> * Removed the unnecessary Tu102/Gh100 consts in gpu/hal.rs and used the
> unit structs directly.
>
> * Kept a hold on the Firmware object in FspFirmware instead of copying
> the FMC ELF into a KVec<u8>.
>
> * Moved the dev_info formatting fix and the GFW_BOOT comment removal
> out of the Copy/Clone patch and into the patches that actually touch
> those lines.
>
> * Added Reviewed-by tags from Gary Guo and Alice Ryhl.
>
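
The per-architecture values listed in the v9 notes can be summarized in a
short sketch. The register addresses and timeouts are taken from the
bullets above; the enum, the function names, and the choice to return None
for pre-Hopper architectures are assumptions made for illustration, not
the series' actual code.

```rust
use std::time::Duration;

/// Illustrative architecture list; variant names follow the cover letter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Architecture {
    Turing,
    Ampere,
    Ada,
    Hopper,
    BlackwellGB10x,
    BlackwellGB20x,
}

/// FSP boot completion register address (hypothetical function name).
/// Pre-Hopper architectures are assumed to have no FSP boot path.
fn fsp_boot_complete_reg(arch: Architecture) -> Option<u32> {
    match arch {
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => None,
        // GB10x shares the Hopper address.
        Architecture::Hopper | Architecture::BlackwellGB10x => Some(0x0002_00bc),
        Architecture::BlackwellGB20x => Some(0x00ad_00bc),
    }
}

/// FSP secure boot timeout (hypothetical function name).
fn fsp_secure_boot_timeout(arch: Architecture) -> Option<Duration> {
    match arch {
        Architecture::Turing | Architecture::Ampere | Architecture::Ada => None,
        Architecture::Hopper | Architecture::BlackwellGB10x => {
            Some(Duration::from_millis(4000))
        }
        Architecture::BlackwellGB20x => Some(Duration::from_millis(5000)),
    }
}
```

Because both matches are exhaustive, adding a new Architecture variant
fails to compile until every such function has been updated, which is the
kind of check a single Blackwell variant could not provide.
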
> Changes in v8:
>
> * Added Clone/Copy derives to Spec and Revision. Removed the
> unnecessary pin_init_scope wrapping in Gpu::new() that the lack of
> Copy had forced. Added a Spec::chipset() accessor.
>
> * Removed implementation-detail sentence from the
> Architecture::dma_mask() doccomment.
>
> * Simplified the GPU HAL to two variants (Tu102, Gh100) instead of
> four. Renamed "Fsp" to "Gh100" to follow the HAL naming convention.
> Removed the spurious GA100 special case. Moved the GFW_BOOT wait into
> the HAL method itself instead of returning a bool.
>
> * Increased the GFW_BOOT wait timeout from 4 seconds to 30 seconds,
> after Joel found that a different Blackwell SKU required extra time.
>
> * Removed stray Cc lines from each patch.
>
> * Fixed rustfmt issues in gsp/fw.rs and gsp/boot.rs reported by the
> kernel test robot against v7 patches 27 and 31.
>
> Changes in v7:
>
> * Rebased onto Alexandre Courbot's rust register!() series in
> drm-rust-next, including the related generic I/O accessor and
> IoCapable changes.
>
> * Rebased onto drm-rust-next (v7.0-rc4 based).
>
> * Dropped the v6 patches that are already in drm-rust-next: the
> aux-device fix, the pdev helper macro patch, and the one-item-per-line
> use cleanup.
>
> * Reworked the GPU init pieces per review. DMA mask setup now stays in
> driver probe, with the mask width selected by GPU architecture, and
> the GFW boot policy now lives in a dedicated GPU HAL.
>
> * Reworked firmware image parsing per review around a single ElfFormat
> trait with associated header types. Also added support for both ELF32
> and ELF64 images, with automatic format detection.
>
> * Reworked the MCTP/NVDM protocol code to use bitfield! and typed
> accessors, removing the open-coded bit handling.
>
> * Reworked the FSP messaging part of the series so that the message
> structures are introduced in the first patches that use them, instead
> of as a standalone dead-code-only patch. Also changed fmc_full to use
> KVec<u8> from the start.
>
> * Split the WPR heap overflow handling out into a separate prep patch.
> That patch makes management_overhead() and wpr_heap_size() fallible,
> uses checked arithmetic, and leaves the larger WPR2 heap patch with
> only the Hopper and Blackwell sizing changes.
>
> * Added a code comment documenting the Hopper and Blackwell PCI config
> mirror base change.
>
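
As background for the ELF32/ELF64 auto-detection mentioned in the v7
notes: the ELF identification bytes carry the class at index 4
(EI_CLASS), where 1 means 32-bit and 2 means 64-bit. The sketch below
shows that check in isolation. ElfClass and detect_elf_class are
hypothetical names, and this is not the ElfFormat trait from the series.

```rust
/// ELF file class, as encoded in e_ident[EI_CLASS].
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ElfClass {
    Elf32,
    Elf64,
}

/// Inspect the identification bytes of an image and report its class.
/// Returns None if the magic is wrong, the image is too short, or the
/// class byte holds an unknown value.
fn detect_elf_class(image: &[u8]) -> Option<ElfClass> {
    const MAGIC: [u8; 4] = [0x7f, b'E', b'L', b'F'];
    const EI_CLASS: usize = 4;

    if !image.starts_with(&MAGIC) {
        return None;
    }
    match *image.get(EI_CLASS)? {
        1 => Some(ElfClass::Elf32), // ELFCLASS32
        2 => Some(ElfClass::Elf64), // ELFCLASS64
        _ => None,
    }
}
```
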
> Changes in v6:
>
> * Rebased onto drm-rust-next (v7.0-rc1 based).
>
> * Dropped the first two patches from v5 (aux device fix and pdev
> macros), which have since been merged independently.
>
> * const_align_up(): reworked per review from Gary Guo, Miguel Ojeda,
> and Danilo Krummrich: now returns Option<usize> instead of panicking,
> takes an Alignment argument instead of a const generic, and no longer
> needs the inline_const feature addition in scripts/Makefile.build.
>
> * The rust/sizes and SZ_*_U64 patches from v5 are no longer included.
> I plan to post those as a separate series that depends on this one.
>
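
The v6 change to const_align_up() (return an Option instead of panicking)
can be sketched with the standard library alone. This is an illustration
under assumptions, not the kernel helper: align_up_checked is a
hypothetical name, and it takes a plain usize alignment where the series
uses an Alignment argument.

```rust
/// Round `value` up to a multiple of `align`.
/// Returns None if `align` is not a power of two or the result would
/// overflow usize, rather than panicking.
const fn align_up_checked(value: usize, align: usize) -> Option<usize> {
    if !align.is_power_of_two() {
        return None;
    }
    let mask = align - 1;
    match value.checked_add(mask) {
        Some(sum) => Some(sum & !mask),
        None => None,
    }
}

/// Being a const fn, it can also be evaluated at compile time.
const PAGE_ALIGNED: Option<usize> = align_up_checked(0x1001, 0x1000);
```

Returning Option leaves the decision about an invalid alignment or an
overflow to the caller, in either a const or a runtime context.
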
> Changes in v5:
>
> * Rebased onto linux.git master.
>
> * Split MCTP protocol into its own module and file.
>
> * Many Rust-idiom improvements, especially more use of types, Result,
> and Option.
>
> * Lots of cleanup of comments and print output and error handling.
>
> * Added const_align_up() to rust/ and used it in nova-core. This
> required enabling a Rust feature: inline_const, as recommended by
> Miguel Ojeda.
>
> * Refactored various things, such as making Gpu::new() own Spec
> creation.
>
> * Fixed three Delta::ZERO busy-polls (patches 21, 24, 31) to use
> non-zero sleep intervals (after just realizing that it was a bad
> choice to have zero in there).
>
> * Reduced GH100/GB100 HAL duplication. Made FSP_PKEY_SIZE/FSP_SIG_SIZE
> consistent across patches. Replaced fragile architecture checks with
> chipset.arch(). Renamed LIBOS_BLACKWELL.
>
> * Narrowed the scope of some of the #![expect(dead_code)] cases,
> although that really only matters within the series, not once it is
> fully applied.
>
> [1] https://github.com/Gnurou/linux/commits/drm-rust-next-staging/
> [2] https://lore.kernel.org/20260411024118.471294-1-jhubbard@xxxxxxxxxx
>
> Alexandre Courbot (1):
> gpu: nova-core: gsp: enable FSP boot path
>
> Eliot Courtney (1):
> gpu: nova-core: add non-sec2 unload path
>
> John Hubbard (20):
> gpu: nova-core: set DMA mask width based on GPU architecture
> gpu: nova-core: Hopper/Blackwell: new location for PCI config mirror
> gpu: nova-core: Blackwell: compute PMU-reserved framebuffer size
> gpu: nova-core: Hopper/Blackwell: larger non-WPR heap
> gpu: nova-core: Hopper/Blackwell: larger WPR2 (GSP) heap
> gpu: nova-core: Blackwell: use correct sysmem flush registers
> gpu: nova-core: don't assume 64-bit firmware images
> gpu: nova-core: add support for 32-bit firmware images
> gpu: nova-core: add auto-detection of 32-bit, 64-bit firmware images
> gpu: nova-core: Hopper/Blackwell: add FSP falcon engine stub
> gpu: nova-core: Hopper/Blackwell: add FMC firmware image
> gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting
> gpu: nova-core: Hopper/Blackwell: add FMC signature extraction
> gpu: nova-core: Hopper/Blackwell: add FSP falcon EMEM operations
> gpu: nova-core: Hopper/Blackwell: add FSP message infrastructure
> gpu: nova-core: add MCTP/NVDM protocol types for firmware communication
> gpu: nova-core: Hopper/Blackwell: add FSP send/receive messaging
> gpu: nova-core: Hopper/Blackwell: add FspCotVersion type
> gpu: nova-core: Hopper/Blackwell: add FSP Chain of Trust boot
> gpu: nova-core: Hopper/Blackwell: add GSP lockdown release polling
>
> drivers/gpu/nova-core/driver.rs | 15 -
> drivers/gpu/nova-core/falcon.rs | 1 +
> drivers/gpu/nova-core/falcon/fsp.rs | 202 +++++++++++
> drivers/gpu/nova-core/fb.rs | 8 +-
> drivers/gpu/nova-core/fb/hal.rs | 28 +-
> drivers/gpu/nova-core/fb/hal/ga100.rs | 5 +
> drivers/gpu/nova-core/fb/hal/ga102.rs | 7 +-
> drivers/gpu/nova-core/fb/hal/gb100.rs | 102 ++++++
> drivers/gpu/nova-core/fb/hal/gb202.rs | 86 +++++
> drivers/gpu/nova-core/fb/hal/gh100.rs | 50 +++
> drivers/gpu/nova-core/fb/hal/tu102.rs | 9 +
> drivers/gpu/nova-core/firmware.rs | 176 +++++++--
> drivers/gpu/nova-core/firmware/fsp.rs | 129 +++++++
> drivers/gpu/nova-core/firmware/gsp.rs | 4 +-
> drivers/gpu/nova-core/fsp.rs | 334 ++++++++++++++++++
> drivers/gpu/nova-core/fsp/hal.rs | 27 ++
> drivers/gpu/nova-core/fsp/hal/gb202.rs | 23 ++
> drivers/gpu/nova-core/fsp/hal/gh100.rs | 23 ++
> drivers/gpu/nova-core/gpu.rs | 34 +-
> drivers/gpu/nova-core/gpu/hal.rs | 13 +-
> drivers/gpu/nova-core/gpu/hal/gh100.rs | 18 +-
> drivers/gpu/nova-core/gpu/hal/tu102.rs | 14 +
> drivers/gpu/nova-core/gsp.rs | 1 +
> drivers/gpu/nova-core/gsp/boot.rs | 2 +-
> drivers/gpu/nova-core/gsp/commands.rs | 8 +-
> drivers/gpu/nova-core/gsp/fw.rs | 85 ++++-
> drivers/gpu/nova-core/gsp/fw/commands.rs | 15 +-
> .../gpu/nova-core/gsp/fw/r570_144/bindings.rs | 83 +++++
> drivers/gpu/nova-core/gsp/hal/gh100.rs | 166 ++++++++-
> drivers/gpu/nova-core/mctp.rs | 100 ++++++
> drivers/gpu/nova-core/nova_core.rs | 2 +
> drivers/gpu/nova-core/regs.rs | 111 ++++++
> 32 files changed, 1800 insertions(+), 81 deletions(-)
> create mode 100644 drivers/gpu/nova-core/falcon/fsp.rs
> create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
> create mode 100644 drivers/gpu/nova-core/fb/hal/gb202.rs
> create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
> create mode 100644 drivers/gpu/nova-core/firmware/fsp.rs
> create mode 100644 drivers/gpu/nova-core/fsp.rs
> create mode 100644 drivers/gpu/nova-core/fsp/hal.rs
> create mode 100644 drivers/gpu/nova-core/fsp/hal/gb202.rs
> create mode 100644 drivers/gpu/nova-core/fsp/hal/gh100.rs
> create mode 100644 drivers/gpu/nova-core/mctp.rs
>
>
> base-commit: 2cfcf9dfb48e932d46c3fa9ae99f1607d1a80162