Re: [PATCH 1/4] gpu: nova-core: move GSP unload state to a pinned Gpu subobject
From: Eliot Courtney
Date: Wed Jun 10 2026 - 08:32:08 EST
On Wed Jun 10, 2026 at 8:18 PM JST, Alexandre Courbot wrote:
> On Wed Jun 10, 2026 at 12:52 PM JST, Eliot Courtney wrote:
>> On Tue Jun 9, 2026 at 5:03 PM JST, Alexandre Courbot wrote:
>>> `Gpu` currently owns the state needed to unload the GSP directly. This
>>> means that `unload_bundle` has to be the last initialized field: once GSP
>>> boot succeeds, any later initialization failure would leave `Gpu`
>>> partially initialized, and its `PinnedDrop` implementation would not run.
>>>
>>> This prevents adding fallible `Gpu` fields that need to query the GSP
>>> after it has booted.
>>>
>>> Move the GSP state and unload bundle into a dedicated pinned
>>> `GspResources` object. Once that subobject has been initialized, its
>>> `PinnedDrop` implementation will run even if initialization of a later
>>> `Gpu` field fails, ensuring that the GSP unload sequence is executed.
>>>
>>> Signed-off-by: Alexandre Courbot <acourbot@xxxxxxxxxx>
>>> ---
>>> drivers/gpu/nova-core/gpu.rs | 86 +++++++++++++++++++++++++-------------------
>>> 1 file changed, 49 insertions(+), 37 deletions(-)
>>>
>>> diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
>>> index b3c91731db45..6b3e02c71dee 100644
>>> --- a/drivers/gpu/nova-core/gpu.rs
>>> +++ b/drivers/gpu/nova-core/gpu.rs
>>> @@ -262,35 +262,59 @@ fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
>>> }
>>> }
>>>
>>> -/// Structure holding the resources required to operate the GPU.
>>> +/// Self-contained resources to operate and drop the GSP.
>>> #[pin_data(PinnedDrop)]
>>> -pub(crate) struct Gpu<'gpu> {
>>> +struct GspResources<'gpu> {
>>> /// Device owning the GPU.
>>> device: &'gpu device::Device<device::Bound>,
>>> - spec: Spec,
>>> /// MMIO mapping of PCI BAR 0.
>>> bar: Bar0<'gpu>,
>>> - /// System memory page required for flushing all pending GPU-side memory writes done through
>>> - /// PCIE into system memory, via sysmembar (A GPU-initiated HW memory-barrier operation).
>>> - sysmem_flush: SysmemFlush<'gpu>,
>>> /// GSP falcon instance, used for GSP boot up and cleanup.
>>> gsp_falcon: Falcon<GspFalcon>,
>>> /// SEC2 falcon instance, used for GSP boot up and cleanup.
>>> sec2_falcon: Falcon<Sec2Falcon>,
>>> - /// GSP runtime data. Temporarily an empty placeholder.
>>> + /// GSP runtime data.
>>> #[pin]
>>> gsp: Gsp,
>>> /// GSP unload firmware bundle, if any.
>>> unload_bundle: Option<gsp::UnloadBundle>,
>>> }
>>>
>>> +/// Structure holding the resources required to operate the GPU.
>>> +#[pin_data]
>>> +pub(crate) struct Gpu<'gpu> {
>>> + spec: Spec,
>>> + /// System memory page required for flushing all pending GPU-side memory writes done through
>>> + /// PCIE into system memory, via sysmembar (A GPU-initiated HW memory-barrier operation).
>>> + sysmem_flush: SysmemFlush<'gpu>,
>>
>> This means sysmem_flush is dropped before unload is run. Before this
>> patch, PinnedDrop runs the unload bundle before sysmem_flush's drop
>> actually runs. But with this code it'll drop sysmem_flush first, and
>> that isn't allowed according to the comment in fb.rs saying that it's
>> needed for falcon reset. What about sysmem flush into GspResources as
>> well?
>
> Yes, Sashiko also found the same issue actually. I will fix that by
> declaring `sysmem_flush` after `gsp_resources` in `Gpu` so it is dropped
> after. Initialization is done in the given order by `pin_init!`, so we
> can still initialize `sysmem_flush` before `gsp_resources`.
With that change,
Reviewed-by: Eliot Courtney <ecourtney@xxxxxxxxxx>