Re: [PATCH v10 24/28] gpu: nova-core: Hopper/Blackwell: larger non-WPR heap

From: Alexandre Courbot

Date: Sun Apr 19 2026 - 22:54:09 EST

On Sat Apr 11, 2026 at 11:49 AM JST, John Hubbard wrote:
> Add dedicated FB HALs for Hopper (GH100) and Blackwell (GB100) with
> architecture-specific non-WPR heap sizes. Hopper uses 2 MiB, Blackwell
> uses 2 MiB + 128 KiB. These are needed for the larger reserved memory
> regions that Hopper/Blackwell GPUs require.
>
> Also adds the non_wpr_heap_size() method to the FbHal trait, and
> the total_reserved_size field to FbLayout.
>
> Signed-off-by: John Hubbard <jhubbard@xxxxxxxxxx>

Let's move this patch earlier in the series, e.g. right after "new
location for PCI config mirror".

This will keep all the architecture setup/non-messaging patches
together, and keep the more complex message queue ones at the end,
forming a more logical flow and letting us merge these ones first
(hopefully reducing the series to just the FSP messaging patches soon).

> ---
> drivers/gpu/nova-core/fb.rs | 14 +++++++---
> drivers/gpu/nova-core/fb/hal.rs | 19 +++++++++-----
> drivers/gpu/nova-core/fb/hal/ga102.rs | 2 +-
> drivers/gpu/nova-core/fb/hal/gb100.rs | 38 +++++++++++++++++++++++++++
> drivers/gpu/nova-core/fb/hal/gh100.rs | 38 +++++++++++++++++++++++++++
> 5 files changed, 101 insertions(+), 10 deletions(-)
> create mode 100644 drivers/gpu/nova-core/fb/hal/gb100.rs
> create mode 100644 drivers/gpu/nova-core/fb/hal/gh100.rs
>
> diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
> index c2005e4b4177..756ff283a908 100644
> --- a/drivers/gpu/nova-core/fb.rs
> +++ b/drivers/gpu/nova-core/fb.rs
> @@ -103,6 +103,15 @@ pub(crate) fn unregister(&self, bar: &Bar0) {
> }
> }
>
> +/// Calculate non-WPR heap size based on chipset architecture.
> +/// This matches the logic used in FSP for consistency.
> +pub(crate) fn calc_non_wpr_heap_size(chipset: Chipset) -> u64 {
> + hal::fb_hal(chipset)
> + .non_wpr_heap_size()
> + .map(u64::from)
> + .unwrap_or(u64::SZ_1M)
> +}

This function is only ever called in a single place (which already has the
HAL at hand), so let's just inline it. The name is also misleading: it
doesn't calculate anything, it just returns a fixed per-chipset value. The
comment is not particularly informative either, so we won't lose much by
dropping it.

> +
> pub(crate) struct FbRange(Range<u64>);
>
> impl FbRange {
> @@ -262,9 +271,8 @@ pub(crate) fn new(chipset: Chipset, bar: &Bar0, gsp_fw: &GspFirmware) -> Result<
> };
>
> let heap = {
> - const HEAP_SIZE: u64 = u64::SZ_1M;
> -
> - FbRange(wpr2.start - HEAP_SIZE..wpr2.start)
> + let heap_size = calc_non_wpr_heap_size(chipset);
> + FbRange(wpr2.start - heap_size..wpr2.start)
> };
>
> Ok(Self {
> diff --git a/drivers/gpu/nova-core/fb/hal.rs b/drivers/gpu/nova-core/fb/hal.rs
> index 3b3bad0feed0..478f80d640c1 100644
> --- a/drivers/gpu/nova-core/fb/hal.rs
> +++ b/drivers/gpu/nova-core/fb/hal.rs
> @@ -12,6 +12,8 @@
>
> mod ga100;
> mod ga102;
> +mod gb100;
> +mod gh100;
> mod tu102;
>
> pub(crate) trait FbHal {
> @@ -28,17 +30,22 @@ pub(crate) trait FbHal {
>
> /// Returns the VRAM size, in bytes.
> fn vidmem_size(&self, bar: &Bar0) -> u64;
> +
> + /// Returns the non-WPR heap size for GPUs that need large reserved memory.
> + ///
> + /// Returns `None` for GPUs that don't need extra reserved memory.
> + fn non_wpr_heap_size(&self) -> Option<u32> {
> + None
> + }

This HAL method is bizarre.

Why return an `Option` when `0` expresses "I don't need extra memory"
just as well?

Then there's the user of this HAL defaulting to `SZ_1M` if it returns
`None`, which was the previous fixed size of `heap`. But if it returns
something other than `None`, then we just use that value for `heap`,
without adding `SZ_1M`... So it appears to return the total amount of
heap, except when it returns `None`, in which case the heap size should
be `SZ_1M`?

If that's the case, then why not just have a `heap_size` that simply
returns the amount of heap needed (without any `Option`), and just use
that without further complications?

Please confirm whether the values returned in the HAL are actually total
heap size, and if they are, let's generalize the HAL to all GPU
generations.
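
Something along these lines, as a rough stand-alone sketch. The types are
simplified stand-ins rather than the actual nova-core ones, and it assumes
that the HAL values are indeed total heap sizes (still to be confirmed) and
that pre-Hopper GPUs keep the current 1 MiB:

```rust
use std::ops::Range;

// Local size constants for this sketch; in the driver these would come from
// the kernel's size helpers instead.
const SZ_128K: u64 = 128 << 10;
const SZ_1M: u64 = 1 << 20;
const SZ_2M: u64 = 2 << 20;

/// Hypothetical, trimmed-down FB HAL: only the heap size method is shown.
trait FbHal {
    /// Returns the total non-WPR heap size, in bytes.
    fn non_wpr_heap_size(&self) -> u64;
}

struct Ga102;
struct Gh100;
struct Gb100;

impl FbHal for Ga102 {
    fn non_wpr_heap_size(&self) -> u64 {
        // Assumption: matches the `HEAP_SIZE` constant the patch removes.
        SZ_1M
    }
}

impl FbHal for Gh100 {
    fn non_wpr_heap_size(&self) -> u64 {
        SZ_2M
    }
}

impl FbHal for Gb100 {
    fn non_wpr_heap_size(&self) -> u64 {
        SZ_2M + SZ_128K
    }
}

/// Heap range ending right where WPR2 starts, with no `Option` and no
/// fallback value at the call site.
fn heap_range(hal: &dyn FbHal, wpr2_start: u64) -> Range<u64> {
    wpr2_start - hal.non_wpr_heap_size()..wpr2_start
}
```

With every generation reporting its own size, the caller no longer needs to
know about a default.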

> }
>
> /// Returns the HAL corresponding to `chipset`.
> -pub(super) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {
> +pub(crate) fn fb_hal(chipset: Chipset) -> &'static dyn FbHal {

The visibility of this method doesn't need to be changed.

> match chipset.arch() {
> Architecture::Turing => tu102::TU102_HAL,
> Architecture::Ampere if chipset == Chipset::GA100 => ga100::GA100_HAL,
> - Architecture::Ampere => ga102::GA102_HAL,
> - Architecture::Ada
> - | Architecture::Hopper
> - | Architecture::BlackwellGB10x
> - | Architecture::BlackwellGB20x => ga102::GA102_HAL,
> + Architecture::Ampere | Architecture::Ada => ga102::GA102_HAL,
> + Architecture::Hopper => gh100::GH100_HAL,
> + Architecture::BlackwellGB10x | Architecture::BlackwellGB20x => gb100::GB100_HAL,
> }
> }
> diff --git a/drivers/gpu/nova-core/fb/hal/ga102.rs b/drivers/gpu/nova-core/fb/hal/ga102.rs
> index 4b9f0f74d0e7..79c5a44f6a29 100644
> --- a/drivers/gpu/nova-core/fb/hal/ga102.rs
> +++ b/drivers/gpu/nova-core/fb/hal/ga102.rs
> @@ -11,7 +11,7 @@
> regs, //
> };
>
> -fn vidmem_size_ga102(bar: &Bar0) -> u64 {
> +pub(super) fn vidmem_size_ga102(bar: &Bar0) -> u64 {
> bar.read(regs::NV_USABLE_FB_SIZE_IN_MB).usable_fb_size()
> }
>
> diff --git a/drivers/gpu/nova-core/fb/hal/gb100.rs b/drivers/gpu/nova-core/fb/hal/gb100.rs
> new file mode 100644
> index 000000000000..bead99a6ca76
> --- /dev/null
> +++ b/drivers/gpu/nova-core/fb/hal/gb100.rs
> @@ -0,0 +1,38 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +use kernel::prelude::*;
> +
> +use crate::{
> + driver::Bar0,
> + fb::hal::FbHal, //
> +};
> +
> +struct Gb100;
> +
> +impl FbHal for Gb100 {
> + fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
> + super::ga100::read_sysmem_flush_page_ga100(bar)
> + }
> +
> + fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
> + super::ga100::write_sysmem_flush_page_ga100(bar, addr);
> +
> + Ok(())
> + }
> +
> + fn supports_display(&self, bar: &Bar0) -> bool {
> + super::ga100::display_enabled_ga100(bar)
> + }
> +
> + fn vidmem_size(&self, bar: &Bar0) -> u64 {
> + super::ga102::vidmem_size_ga102(bar)
> + }
> +
> + fn non_wpr_heap_size(&self) -> Option<u32> {
> + // 2 MiB + 128 KiB non-WPR heap for Blackwell (see Open RM: kgspCalculateFbLayout_GB100).
> + Some(0x220000)

I could not find any function named `kgspCalculateFbLayout_GB100` in
OpenRM. There is one for GH100, but not for GB100. Also, that function
computes the whole layout; the relevant one would be
`kgspGetNonWprHeapSize`.

Also I know this is a bit overkill, but let's make this a module-level
constant so other HALs can refer to it if needed.
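
For instance, a minimal stand-alone sketch. The constant name
`NON_WPR_HEAP_SIZE` is only a suggestion, and the size constants are
defined locally so that the snippet compiles on its own:

```rust
// Local size constants for this sketch only.
const SZ_128K: u32 = 128 << 10;
const SZ_2M: u32 = 2 << 20;

/// Non-WPR heap size for Blackwell: 2 MiB + 128 KiB.
/// (Hypothetical name; in the driver this would likely be `pub(super)`.)
const NON_WPR_HEAP_SIZE: u32 = SZ_2M + SZ_128K;

// Compile-time check that the spelled-out sum matches the literal currently
// used in the patch.
const _: () = assert!(NON_WPR_HEAP_SIZE == 0x220000);
```

Spelling the value as a sum also makes the "2 MiB + 128 KiB" part of the
comment redundant.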

> + }
> +}
> +
> +const GB100: Gb100 = Gb100;
> +pub(super) const GB100_HAL: &dyn FbHal = &GB100;
> diff --git a/drivers/gpu/nova-core/fb/hal/gh100.rs b/drivers/gpu/nova-core/fb/hal/gh100.rs
> new file mode 100644
> index 000000000000..32d7414e6243
> --- /dev/null
> +++ b/drivers/gpu/nova-core/fb/hal/gh100.rs
> @@ -0,0 +1,38 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +use kernel::prelude::*;
> +
> +use crate::{
> + driver::Bar0,
> + fb::hal::FbHal, //
> +};
> +
> +struct Gh100;
> +
> +impl FbHal for Gh100 {
> + fn read_sysmem_flush_page(&self, bar: &Bar0) -> u64 {
> + super::ga100::read_sysmem_flush_page_ga100(bar)
> + }
> +
> + fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result {
> + super::ga100::write_sysmem_flush_page_ga100(bar, addr);
> +
> + Ok(())
> + }
> +
> + fn supports_display(&self, bar: &Bar0) -> bool {
> + super::ga100::display_enabled_ga100(bar)
> + }
> +
> + fn vidmem_size(&self, bar: &Bar0) -> u64 {
> + super::ga102::vidmem_size_ga102(bar)
> + }
> +
> + fn non_wpr_heap_size(&self) -> Option<u32> {
> + // 2 MiB non-WPR heap for Hopper (see Open RM: kgspCalculateFbLayout_GH100).

Here also let's remove the reference to `kgspCalculateFbLayout_GH100` -
it does exist as of 570.144, but has been removed in `main`.
`kgspGetNonWprHeapSize` is a more accurate reference if we need one.