Re: [PATCH v10 12/21] gpu: nova-core: mm: Add unified page table entry wrapper enums

From: Alexandre Courbot

Date: Wed Apr 08 2026 - 03:06:48 EST

On Tue Apr 7, 2026 at 10:59 PM JST, Joel Fernandes wrote:
> Hi Eliot,
>
> On 4/7/2026 9:42 AM, Eliot Courtney wrote:
>> On Tue Apr 7, 2026 at 6:55 AM JST, Joel Fernandes wrote:
>>>>> +    /// Compute upper bound on page table pages needed for `num_virt_pages`.
>>>>> +    ///
>>>>> +    /// Walks from PTE level up through PDE levels, accumulating the tree.
>>>>> +    pub(crate) fn pt_pages_upper_bound(&self, num_virt_pages: usize) -> usize {
>>>>> +        let mut total = 0;
>>>>> +
>>>>> +        // PTE pages at the leaf level.
>>>>> +        let pte_epp = self.entries_per_page(self.pte_level());
>>>>> +        let mut pages_at_level = num_virt_pages.div_ceil(pte_epp);
>>>>> +        total += pages_at_level;
>>>>> +
>>>>> +        // Walk PDE levels bottom-up (reverse of pde_levels()).
>>>>> +        for &level in self.pde_levels().iter().rev() {
>>>>> +            let epp = self.entries_per_page(level);
>>>>> +
>>>>> +            // How many pages at this level do we need to point to
>>>>> +            // the previous pages_at_level?
>>>>> +            pages_at_level = pages_at_level.div_ceil(epp);
>>>>> +            total += pages_at_level;
>>>>> +        }
>>>>> +
>>>>> +        total
>>>>> +    }
>>>>> +}
>>>>> +
>>>>
>>>> We have a lot of matches on the MMU version here (and below in Pte, Pde,
>>>> DualPde). What about making MmuVersion into a trait (e.g. Mmu) with
>>>> associated types for Pte, Pde, DualPde which can implement traits
>>>> defining their common operations too?
>>>
>>> I coded this up and it did not look pretty: there are not many LOC savings and
>>> the code becomes harder to read because of parametrization of several functions. Also:
>>
>> Thanks for looking into it. Sorry to be a bother, but would you have a
>> branch around with the code? I'm curious what didn't look good about it.
>
> Sorry, but I already mentioned that above: the parameterizing of dozens of
> function call sites, 3-4 new traits (because each struct like
> Pte/Pde/DualPde etc. needs its own trait which different MMU versions
> implement), etc. The code becomes hard to read, and readability is the most
> critical criterion for me - I am personally strictly against "Let's use shiny
> features in the language at the cost of making code unreadable". Because that
> translates into bugs and a nightmare for maintainability.

After a quick look I'd say that having a trait here would actually be
*good* for correctness and maintainability.

The current design implies that every operation on a page table (most
likely using the walker) goes through a branching point. Just looking at
`PtWalk::read_pte_at_level`, there are already at least 6
`if version == 2 { } else { }` branches that all resolve to the same
result. Include walking down the PDEs and you have at least a dozen of
these just to resolve a virtual address. I know CPUs are fast, but this
is still wasted cycles for no good reason.

If you use a trait here, and make `PtWalk` generic over it, you can
optimize these branches away. We had a similar situation when we introduced Turing
support and the v2 ucode header, and tried both approaches: the
trait-based one was slightly shorter, and arguably more readable.
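
To illustrate the idea, here is a rough standalone sketch in plain Rust. It
is not nova-core code: the `Mmu` trait, the `MmuV2`/`MmuV3` marker types,
`index_at_level` and all the bit widths are made up for the example, and only
the name `PtWalk` comes from the patch. The point is that the per-version
parameters become associated constants, so each instantiation of `PtWalk` is
compiled for a single version and contains no version check:

```rust
use core::marker::PhantomData;

/// Hypothetical trait standing in for the `MmuVersion` enum.
pub trait Mmu {
    /// Index bits consumed by each level, root first (made-up values).
    const INDEX_BITS: &'static [u32];
    /// log2 of the page size (made-up values).
    const PAGE_SHIFT: u32;
}

/// Marker types, one per MMU version (hypothetical names).
pub struct MmuV2;
pub struct MmuV3;

impl Mmu for MmuV2 {
    const INDEX_BITS: &'static [u32] = &[9, 9, 9];
    const PAGE_SHIFT: u32 = 12;
}

impl Mmu for MmuV3 {
    const INDEX_BITS: &'static [u32] = &[8, 10];
    const PAGE_SHIFT: u32 = 16;
}

/// Walker that is generic over the MMU version.
pub struct PtWalk<M: Mmu> {
    _mmu: PhantomData<M>,
}

impl<M: Mmu> PtWalk<M> {
    pub fn new() -> Self {
        Self { _mmu: PhantomData }
    }

    /// Index into the table at `level` (0 = root) for virtual address `va`.
    /// Returns `None` if `level` does not exist for this MMU version.
    pub fn index_at_level(&self, va: u64, level: usize) -> Option<u64> {
        let bits = *M::INDEX_BITS.get(level)?;
        // Index bits consumed by all the levels below this one.
        let lower: u32 = M::INDEX_BITS[level + 1..].iter().sum();
        Some((va >> (M::PAGE_SHIFT + lower)) & ((1u64 << bits) - 1))
    }
}
```

A caller would pick the version once, e.g. `PtWalk::<MmuV2>::new()`, and
everything below that point is resolved at compile time.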

But the main argument to use a trait here IMO is that it enables
associated types and constants. That's particularly critical since some
equivalent fields have different lengths between v2 and v3. An
associated `Bounded` type for these would force the caller to validate
the length of these fields before calling an infallible operation,
which is exactly the level of caution that we want when dealing with
page tables.
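
As a rough sketch of what I mean, in plain standalone Rust: `BoundedU64`
below is only a stand-in written for this example, not the `kernel` crate's
`Bounded` API, and the `PteFormat` trait, the `V2`/`V3` types, the field
widths and the bit positions are all invented. What matters is the shape:
validation happens once in a fallible constructor, and the operation that
builds the entry takes the already-validated type and cannot fail.

```rust
/// Stand-in for a bounded integer: a `u64` known to fit in `BITS` bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct BoundedU64<const BITS: u32>(u64);

impl<const BITS: u32> BoundedU64<BITS> {
    /// Fallible constructor: the only place where the width is checked.
    pub fn try_new(value: u64) -> Option<Self> {
        // `checked_shr` returns `None` when `BITS >= 64`, in which case
        // every `u64` fits.
        if value.checked_shr(BITS).map_or(true, |rest| rest == 0) {
            Some(Self(value))
        } else {
            None
        }
    }

    pub fn get(self) -> u64 {
        self.0
    }
}

/// Hypothetical per-version PTE format.
pub trait PteFormat {
    /// Frame number type, with a version-specific width.
    type Frame: Copy;

    /// Infallible: the frame number has already been validated.
    fn make_pte(frame: Self::Frame) -> u64;
}

pub struct V2;
pub struct V3;

impl PteFormat for V2 {
    type Frame = BoundedU64<25>; // made-up width

    fn make_pte(frame: Self::Frame) -> u64 {
        // Made-up layout: frame number at bit 8, valid flag in bit 0.
        (frame.get() << 8) | 1
    }
}

impl PteFormat for V3 {
    type Frame = BoundedU64<40>; // made-up width

    fn make_pte(frame: Self::Frame) -> u64 {
        // Made-up layout: frame number at bit 12, valid flag in bit 0.
        (frame.get() << 12) | 1
    }
}
```

With this shape, passing a `BoundedU64<40>` to `V2::make_pte` is a type
error, so a frame number that is too wide for v2 cannot silently be
truncated into an entry.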

In order to fully benefit from it, we will need the bitfield macro from
the `kernel` crate so the PDE/PTE fields can be `Bounded`. I will try to
make it available quickly in a patch that you can depend on.

But long story short, and although I need to dive deeper into the code,
this looks like a good candidate for using a trait and associated types.