Re: [PATCH 1/2] gpu: nova-core: Hopper: use correct sysmem flush registers
From: John Hubbard
Date: Tue Jun 09 2026 - 22:47:45 EST
On 6/9/26 6:54 AM, Alexandre Courbot wrote:
> On Thu Jun 4, 2026 at 8:50 AM JST, John Hubbard wrote:
> <...>
>> diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
>> index 0f49c1ab83ad..a57e95140ec0 100644
>> --- a/drivers/gpu/nova-core/regs.rs
>> +++ b/drivers/gpu/nova-core/regs.rs
>> @@ -131,6 +131,22 @@ fn fmt(&self, f: &mut kernel::fmt::Formatter<'_>) -> kernel::fmt::Result {
>> 23:0 adr_63_40;
>> }
>>
>> + /// Low bits of the physical system memory address used by the GPU to perform sysmembar
>> + /// operations on Hopper (see [`crate::fb::SysmemFlush`]).
>> + ///
>> + /// Unlike the Ampere `NV_PFB_NISO_FLUSH_SYSMEM_ADDR` registers, which encode the address with
>> + /// an 8-bit right-shift, the Hopper FBHUB registers take the raw address split into lower and
>> + /// upper halves. Hardware ignores bits 7:0 of the LO register.
>> + pub(crate) NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ 0x00100a34 {
>> + 31:0 adr => u32;
>> + }
>> +
>> + /// High bits of the physical system memory address used by the GPU to perform sysmembar
>> + /// operations on Hopper (see [`crate::fb::SysmemFlush`]).
>> + pub(crate) NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ 0x00100a38 {
>> + 19:0 adr;
>> + }
>> +
>> pub(crate) NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE(u32) @ 0x00100ce0 {
>> 30:30 ecc_mode_enabled => bool;
>> 9:4 lower_mag;
>> @@ -179,15 +195,16 @@ fn fmt(&self, f: &mut kernel::fmt::Formatter<'_>) -> kernel::fmt::Result {
>> 19:0 adr;
>> }
>>
>> - // GB20x sysmem flush registers, relative to the FBHUB0 base. Unlike the older
>> - // NV_PFB_NISO_FLUSH_SYSMEM_ADDR registers which encode the address with an 8-bit
>> - // right-shift, these take the raw address split into lower and upper halves. Hardware
>> + // GB20x sysmem flush registers, relative to the FBHUB0 base. Like the Hopper
>> + // NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR registers, and unlike the older
>> + // NV_PFB_NISO_FLUSH_SYSMEM_ADDR registers (which encode the address with an 8-bit
>> + // right-shift), these take the raw address split into lower and upper halves. Hardware
>> // ignores bits 7:0 of the LO register.
>> - pub(crate) NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ Fbhub0Base + 0x00001d58 {
>> + pub(crate) NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_LO(u32) @ Fbhub0Base + 0x00001d58 {
>> 31:0 adr => u32;
>> }
>>
>> - pub(crate) NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ Fbhub0Base + 0x00001d5c {
>> + pub(crate) NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI(u32) @ Fbhub0Base + 0x00001d5c {
>
> Since we are switching the name to `FBHUB0`, wouldn't it make sense to
> make this an absolute register as well? Because now this reads like a
> FBHUB0 register using Fbhub0 as its base, which is redundant.
>
> I know I am the one who suggested making it relative, but after looking
> at OpenRM again it doesn't really seem to make sense - there is really
> only one base here, and the register is defined as an absolute one
> anyway. So my bad for suggesting this in the first place.
>
> If we do this, maybe split into two patches:
>
> - Rename `NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_HI` into
> `NV_PFB_FBHUB0_PCIE_FLUSH_SYSMEM_ADDR_HI` and make it absolute to
> align with OpenRM,
> - Introduce Hopper's `NV_PFB_FBHUB_PCIE_FLUSH_SYSMEM_ADDR_HI` and use it.
>
> This keeps the changes local and simple, as renaming makes things harder
> to track.
Ok yes, I'll post a v2 with the above approach.
>
> As a side-note (not for this patch), we may want to move registers move
> aggressively into sub-modules. The Hopper register seems to be only used
> on Hopper, so it would make sense to me to move it into a `gh100`
> submodule when we eventually move these definitions under `fb/regs.rs`
> as is the plan.
Ack, some register organization is definitely called for, and that's
a solid first step.
>
> Or maybe we can have an even more aggressive policy (which iirc has
> already been mentioned on another patchset) of defining the registers
> directly into the HALs themselves. Subsystems like `fb` only ever access
> their registers in the HALs, so for them at least that could make a lot
> of sense.
Perhaps, let's see.
thanks,
--
John Hubbard