Re: [PATCH v9 19/31] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting

From: Alexandre Courbot

Date: Tue Apr 07 2026 - 22:04:11 EST


On Thu Mar 26, 2026 at 10:38 AM JST, John Hubbard wrote:
> Add the FSP (Firmware System Processor) module for Hopper/Blackwell GPUs.
> These architectures use a simplified firmware boot sequence:
>
> FMC --> FSP --> GSP, with no SEC2 involvement.
>
> This commit adds the ability to wait for FSP secure boot completion by
> polling the I2CS thermal scratch register until FSP signals success.

This patch does more than add the boot completion waiting (which is only
the `wait_secure_boot` method); it looks like most of the content of
`fsp.rs` should be moved to different patches.

>
> Signed-off-by: John Hubbard <jhubbard@xxxxxxxxxx>
> ---
> drivers/gpu/nova-core/fsp.rs | 148 +++++++++++++++++++++++++++++
> drivers/gpu/nova-core/nova_core.rs | 1 +
> drivers/gpu/nova-core/regs.rs | 29 ++++++
> 3 files changed, 178 insertions(+)
> create mode 100644 drivers/gpu/nova-core/fsp.rs
>
> diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
> new file mode 100644
> index 000000000000..6d32e03d89f9
> --- /dev/null
> +++ b/drivers/gpu/nova-core/fsp.rs
> @@ -0,0 +1,148 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! FSP (Firmware System Processor) interface for Hopper/Blackwell GPUs.
> +//!
> +//! Hopper/Blackwell use a simplified firmware boot sequence: FMC --> FSP --> GSP.
> +//! Unlike Turing/Ampere/Ada, there is NO SEC2 (Security Engine 2) usage.
> +//! FSP handles secure boot directly using FMC firmware + Chain of Trust.
> +
> +use kernel::{
> + device,
> + io::poll::read_poll_timeout,
> + prelude::*,
> + time::Delta,
> + transmute::{
> + AsBytes,
> + FromBytes, //
> + },
> +};
> +
> +use crate::regs;
> +
> +/// FSP secure boot completion timeout in milliseconds.
> +///
> +/// GB20x requires a longer timeout than Hopper/GB10x.
> +const fn fsp_secure_boot_timeout_ms(arch: crate::gpu::Architecture) -> i64 {
> + match arch {
> + crate::gpu::Architecture::BlackwellGB20x => 5000,
> + _ => 4000,
> + }
> +}

Since this is a timeout, how about harmonizing to 5000 for everyone and
turning this into a constant? Waiting 1 more second in case of a boot
failure should be acceptable. :)
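
Roughly what I have in mind, as an untested standalone sketch rather than
driver code. The constant name, the local `Architecture` enum (only
`BlackwellGB20x` appears in the patch; the other variants are placeholders)
and the `old_timeout_ms` helper are all made up for illustration:

```rust
/// FSP secure boot completion timeout in milliseconds.
///
/// Assumption: one value for all architectures, set to the longest
/// timeout used in the patch (the GB20x one).
const FSP_SECURE_BOOT_TIMEOUT_MS: i64 = 5000;

/// Placeholder for the driver's architecture enum. Only `BlackwellGB20x`
/// is named in the patch; the other variants are hypothetical.
enum Architecture {
    Hopper,
    BlackwellGB10x,
    BlackwellGB20x,
}

/// The per-architecture lookup from the patch, kept here only to compare
/// it against the single constant.
const fn old_timeout_ms(arch: Architecture) -> i64 {
    match arch {
        Architecture::BlackwellGB20x => 5000,
        _ => 4000,
    }
}
```

The constant is never shorter than what the per-architecture function
returns, so the only behavior change is that a failed boot on a
non-GB20x chip is reported up to one second later.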