Re: [PATCH v8 05/31] gpu: nova-core: set DMA mask width based on GPU architecture

From: John Hubbard

Date: Wed Mar 25 2026 - 21:22:52 EST

On 3/25/26 6:56 AM, Alexandre Courbot wrote:
> On Wed Mar 25, 2026 at 10:38 PM JST, Danilo Krummrich wrote:
>> On Wed Mar 25, 2026 at 12:45 PM CET, Alexandre Courbot wrote:
>>> On Wed Mar 25, 2026 at 8:31 PM JST, Danilo Krummrich wrote:
>>>> On Wed Mar 25, 2026 at 4:52 AM CET, John Hubbard wrote:
>>>>> Move Spec creation into probe() so the architecture is known before
>>>>> setting the DMA mask, and pass Spec into Gpu::new().
>>>>
>>>> Hm...this is not what we discussed in [1]. You implemented the change in v5 and
>>>> v6 and v7 reverted it. Why?
>>>>
>>>> [1] https://lore.kernel.org/all/DGC28K8J39E2.1X173NHT9ZJDI@xxxxxxxxxx/
>>>
>>> That was on my feedback [1].
>>>
>>> `Gpu::new` carries no guarantee in itself that no DMA ops have been
>>> performed so far; whereas in `probe` this is much clearer.
>>
>> That's not what has to be justified. Having DMA allocations before setting the
>> mask is not a violation of the safety contract; it is about concurrent
>> allocations.
>>
>> That said, all DMA allocations are on the Gpu object, so there can't be
>> concurrent DMA allocations while we are still in the constructor of the Gpu
>> object.
>>
>> Unless...one would try to create a second Gpu object from probe() (which
>> obviously makes zero sense). nova-core already relies on Gpu to be unique per
>> device, as attempting to create a second Gpu object causes UB and actual memory
>> corruption when there is no IOMMU. :)
>>
>> [ 123.966391] pci 0000:01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000e address=0xf7fef000 flags=0x0000]
>>
>> So, I think technically Gpu::new() already has the safety requirement that it
>> must not be constructed more than once per device.
>
> Yeah, I guess that's sufficient indeed. John, my apologies for the
> back-and-forth.

Not a problem! :)

thanks,
--
John Hubbard