Re: [PATCH v10 07/21] gpu: nova-core: mm: Add TLB flush support
From: Matthew Brost
Date: Tue Apr 07 2026 - 01:14:48 EST
On Mon, Apr 06, 2026 at 06:10:07PM -0400, Joel Fernandes wrote:
>
>
> On 4/6/2026 5:24 PM, Joel Fernandes wrote:
> >
> >
> > On 4/2/2026 1:59 AM, Matthew Brost wrote:
> >> On Tue, Mar 31, 2026 at 05:20:34PM -0400, Joel Fernandes wrote:
> >>> Add TLB (Translation Lookaside Buffer) flush support for GPU MMU.
> >>>
> >>> After modifying page table entries, the GPU's TLB must be invalidated
> >>> to ensure the new mappings take effect. The Tlb struct provides flush
> >>> functionality through BAR0 registers.
> >>>
> >>> The flush operation writes the page directory base address and triggers
> >>> an invalidation, polling for completion with a 2 second timeout matching
> >>> the Nouveau driver.
> >>>
> >>> Cc: Nikola Djukic <ndjukic@xxxxxxxxxx>
> >>> Signed-off-by: Joel Fernandes <joelagnelf@xxxxxxxxxx>
> >>> ---
> >>> drivers/gpu/nova-core/mm.rs | 1 +
> >>> drivers/gpu/nova-core/mm/tlb.rs | 95 +++++++++++++++++++++++++++++++++
> >>> drivers/gpu/nova-core/regs.rs | 42 +++++++++++++++
> >>> 3 files changed, 138 insertions(+)
> >>> create mode 100644 drivers/gpu/nova-core/mm/tlb.rs
> >>>
> >>> diff --git a/drivers/gpu/nova-core/mm.rs b/drivers/gpu/nova-core/mm.rs
> >>> index 8f3089a5fa88..cfe9cbe11d57 100644
> >>> --- a/drivers/gpu/nova-core/mm.rs
> >>> +++ b/drivers/gpu/nova-core/mm.rs
> >>> @@ -5,6 +5,7 @@
> >>> #![expect(dead_code)]
> >>>
> >>> pub(crate) mod pramin;
> >>> +pub(crate) mod tlb;
> >>>
> >>> use kernel::sizes::SZ_4K;
> >>>
> >>> diff --git a/drivers/gpu/nova-core/mm/tlb.rs b/drivers/gpu/nova-core/mm/tlb.rs
> >>> new file mode 100644
> >>> index 000000000000..cd3cbcf4c739
> >>> --- /dev/null
> >>> +++ b/drivers/gpu/nova-core/mm/tlb.rs
> >>> @@ -0,0 +1,95 @@
> >>> +// SPDX-License-Identifier: GPL-2.0
> >>> +
> >>> +//! TLB (Translation Lookaside Buffer) flush support for GPU MMU.
> >>> +//!
> >>> +//! After modifying page table entries, the GPU's TLB must be flushed to
> >>> +//! ensure the new mappings take effect. This module provides TLB flush
> >>> +//! functionality for virtual memory managers.
> >>> +//!
> >>> +//! # Example
> >>> +//!
> >>> +//! ```ignore
> >>> +//! use crate::mm::tlb::Tlb;
> >>> +//!
> >>> +//! fn page_table_update(tlb: &Tlb, pdb_addr: VramAddress) -> Result<()> {
> >>> +//! // ... modify page tables ...
> >>> +//!
> >>> +//! // Flush TLB to make changes visible (polls for completion).
> >>> +//! tlb.flush(pdb_addr)?;
> >>> +//!
> >>> +//! Ok(())
> >>> +//! }
> >>> +//! ```
> >>> +
> >>> +use kernel::{
> >>> + devres::Devres,
> >>> + io::poll::read_poll_timeout,
> >>> + io::Io,
> >>> + new_mutex,
> >>> + prelude::*,
> >>> + sync::{
> >>> + Arc,
> >>> + Mutex, //
> >>> + },
> >>> + time::Delta, //
> >>> +};
> >>> +
> >>> +use crate::{
> >>> + driver::Bar0,
> >>> + mm::VramAddress,
> >>> + regs, //
> >>> +};
> >>> +
> >>> +/// TLB manager for GPU translation buffer operations.
> >>> +#[pin_data]
> >>> +pub(crate) struct Tlb {
> >>> + bar: Arc<Devres<Bar0>>,
> >>> + /// TLB flush serialization lock: This lock is acquired during the
> >>> + /// DMA fence signalling critical path. It must NEVER be held across any
> >>> + /// reclaimable CPU memory allocations because the memory reclaim path can
> >>> + /// call `dma_fence_wait()`, which would deadlock with this lock held.
> >>> + #[pin]
> >>> + lock: Mutex<()>,
> >>> +}
> >>> +
> >>> +impl Tlb {
> >>> + /// Create a new TLB manager.
> >>> + pub(super) fn new(bar: Arc<Devres<Bar0>>) -> impl PinInit<Self> {
> >>> + pin_init!(Self {
> >>> + bar,
> >>> + lock <- new_mutex!((), "tlb_flush"),
> >>> + })
> >>> + }
> >>> +
> >>> + /// Flush the GPU TLB for a specific page directory base.
> >>> + ///
> >>> + /// This invalidates all TLB entries associated with the given PDB address.
> >>> + /// Must be called after modifying page table entries to ensure the GPU sees
> >>> + /// the updated mappings.
> >>> + pub(crate) fn flush(&self, pdb_addr: VramAddress) -> Result {
> >>
> >> This landed on my list randomly, so I took a look.
> >>
> >> Wouldn’t you want to virtualize the invalidation based on your device?
> >> For example, what if you need to register interface changes on future hardware?
> >
> > Good point, for future hardware it indeed makes sense. I will do that.
> Actually, at least in the future as far as I can see, the register definitions
> are the same for TLB invalidation are the same, so we are good and I will not be
> making any change in this regard.
>
> But, thanks for raising the point and forcing me to double check!
>
Not my driver, but this looks like a classic “works now” change that may
not hold up later, which is why I replied to something that isn’t really
my business.
Again, not my area, but I’ve been through this before. Generally,
getting the abstractions right up front pays off.
Matt
> --
> Joel Fernandes
>