Re: [PATCH 05/13] gpu: nova-core: gsp: keep FMC boot params DMA region alive during error

From: Eliot Courtney

Date: Wed Jun 17 2026 - 01:27:44 EST


On Tue Jun 16, 2026 at 2:23 AM JST, Gary Guo wrote:
> On Mon Jun 15, 2026 at 3:40 PM BST, Eliot Courtney wrote:
>> Currently, if, for example `boot_fmc` fails, `FmcBootArgs` will be
>> dropped before the boot unload guard. But until everything is unloaded,
>> GSP may access this memory, so make sure it doesn't get deallocated.
>
> Hmm, this looks very weirld. `boot_fmc` only needs `&args` but it actually need
> it for much longer?
>
> This is hinting to me that the signature is wrong of the `boot_fmc` function is
> wrong..
>
> What is the exact lifetime requirement for GSP?

Once `wait_for_gsp_lockdown_release` returns, it no longer needs the
allocation. In the case that we get an error in this function, GSP may
be (asychronously to the CPU) accessing the DMA memory we gave it. So
FmcBootArgs should stay alive until FspUnloadBundle finishes (GSP is
reset) if there is an error, or, until `wait_for_gsp_lockdown_release`
returns successfully.

So for `boot_fmc` to cover the actual lifetime it would need to be
responsible for waiting until GSP is reset, which it doesn't feel like
it should be responsible for (even as some unload bundle that it
returns).

An alternative could be making another local guard holding the
reference:
```
let unload_guard = FmcBootGuard::new(
BootUnloadGuard::new(gsp, dev, bar, gsp_falcon, sec2_falcon, Some(unload_bundle)),
&args,
);
```

WDYT?

>
>>
>> Signed-off-by: Eliot Courtney <ecourtney@xxxxxxxxxx>
>> ---
>> drivers/gpu/nova-core/gsp/hal/gh100.rs | 20 ++++++++++----------
>> 1 file changed, 10 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/gpu/nova-core/gsp/hal/gh100.rs b/drivers/gpu/nova-core/gsp/hal/gh100.rs
>> index 98f5ce197d13..b08761af89d3 100644
>> --- a/drivers/gpu/nova-core/gsp/hal/gh100.rs
>> +++ b/drivers/gpu/nova-core/gsp/hal/gh100.rs
>> @@ -162,16 +162,6 @@ fn boot<'a>(
>> ) -> Result<BootUnloadGuard<'a>> {
>> let fsp_fw = FspFirmware::new(dev, chipset, FIRMWARE_VERSION)?;
>>
>> - let unload_bundle = crate::gsp::UnloadBundle(
>> - KBox::new(FspUnloadBundle, GFP_KERNEL)? as KBox<dyn UnloadBundle>
>> - );
>> -
>> - // Wrap the unload bundle into a drop guard so it is automatically run upon failure.
>> - let unload_guard =
>> - BootUnloadGuard::new(gsp, dev, bar, gsp_falcon, sec2_falcon, Some(unload_bundle));
>> -
>> - let mut fsp = Fsp::wait_secure_boot(dev, bar, chipset, fsp_fw)?;
>> -
>> let args = FmcBootArgs::new(
>> dev,
>> chipset,
>> @@ -180,6 +170,16 @@ fn boot<'a>(
>> false,
>> )?;
>>
>> + let mut fsp = Fsp::wait_secure_boot(dev, bar, chipset, fsp_fw)?;
>> +
>> + let unload_bundle = crate::gsp::UnloadBundle(
>> + KBox::new(FspUnloadBundle, GFP_KERNEL)? as KBox<dyn UnloadBundle>
>> + );
>> +
>> + // Wrap the unload bundle into a drop guard so it is automatically run upon failure.
>> + let unload_guard =
>> + BootUnloadGuard::new(gsp, dev, bar, gsp_falcon, sec2_falcon, Some(unload_bundle));
>> +
>
> This usual "usage beyond reference lifetime" needs to be at least explicitly
> mentioned here.
>
> Best,
> Gary
>
>> fsp.boot_fmc(dev, bar, fb_layout, &args)?;
>>
>> wait_for_gsp_lockdown_release(dev, bar, gsp_falcon, args.boot_params_dma_handle())?;