Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs

From: Nicolas Dufresne
Date: Wed Feb 05 2025 - 13:14:43 EST

Next message: Yosry Ahmed: "Re: A path forward to cleaning up dying cgroups?"
Previous message: Aboorva Devarajan: "[PATCH 1/1] cpufreq: prevent NULL dereference in cpufreq_online"
In reply to: Maxime Ripard: "Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs"
Next in thread: Boris Brezillon: "Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Le mercredi 05 février 2025 à 15:52 +0100, Maxime Ripard a écrit :
> On Mon, Feb 03, 2025 at 04:43:23PM +0000, Florent Tomasin wrote:
> > Hi Maxime, Nicolas
> >
> > On 30/01/2025 17:47, Nicolas Dufresne wrote:
> > > Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :
> > > > Hi Nicolas,
> > > >
> > > > On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:
> > > > > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :
> > > > > > Hi,
> > > > > >
> > > > > > I started to review it, but it's probably best to discuss it here.
> > > > > >
> > > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > This is a patch series covering the support for protected mode execution in
> > > > > > > Mali Panthor CSF kernel driver.
> > > > > > >
> > > > > > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > > > > > HW level. This feature requires two main changes in the kernel driver:
> > > > > > >
> > > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > > > > > > heap from which the driver can allocate a protected buffer.
> > > > > > > It can be a carved-out memory or dynamically allocated protected memory region.
> > > > > > > Some system includes a trusted FW which is in charge of the protected memory.
> > > > > > > Since this problem is integration specific, the Mali Panthor CSF kernel
> > > > > > > driver must import the protected memory from a device specific exporter.
> > > > > >
> > > > > > Why do you need a heap for it in the first place? My understanding of
> > > > > > your series is that you have a carved out memory region somewhere, and
> > > > > > you want to allocate from that carved out memory region your buffers.
> > > > > >
> > > > > > How is that any different from using a reserved-memory region, adding
> > > > > > the reserved-memory property to the GPU device and doing all your
> > > > > > allocation through the usual dma_alloc_* API?
> > > > >
> > > > > How do you then multiplex this region so it can be shared between
> > > > > GPU/Camera/Display/Codec drivers and also userspace ?
> > > >
> > > > You could point all the devices to the same reserved memory region, and
> > > > they would all allocate from there, including for their userspace-facing
> > > > allocations.
> > >
> > > I get that using memory region is somewhat more of an HW description, and
> > > aligned with what a DT is supposed to describe. One of the challenge is that
> > > Mediatek heap proposal endup calling into their TEE, meaning knowing the region
> > > is not that useful. You actually need the TEE APP guid and its IPC protocol. If
> > > we can dell drivers to use a head instead, we can abstract that SoC specific
> > > complexity. I believe each allocated addressed has to be mapped to a zone, and
> > > that can only be done in the secure application. I can imagine similar needs
> > > when the protection is done using some sort of a VM / hypervisor.
> > >
> > > Nicolas
> > >
> >
> > The idea in this design is to abstract the heap management from the
> > Panthor kernel driver (which consumes a DMA buffer from it).
> >
> > In a system, an integrator would have implemented a secure heap driver,
> > and could be based on TEE or a carved-out memory with restricted access,
> > or else. This heap driver would be responsible of implementing the
> > logic to: allocate, free, refcount, etc.
> >
> > The heap would be retrieved by the Panthor kernel driver in order to
> > allocate protected memory to load the FW and allow the GPU to enter/exit
> > protected mode. This memory would not belong to a user space process.
> > The driver allocates it at the time of loading the FW and initialization
> > of the GPU HW. This is a device globally owned protected memory.
>
> The thing is, it's really not clear why you absolutely need to have the
> Panthor driver involved there. It won't be transparent to userspace,
> since you'd need an extra flag at allocation time, and the buffers
> behave differently. If userspace has to be aware of it, what's the
> advantage to your approach compared to just exposing a heap for those
> secure buffers, and letting userspace allocate its buffers from there?

Unless I'm mistaken, the Panthor driver loads its own firmware. Since loading
the firmware requires placing the data in a protected memory region, and that
this aspect has no exposure to userspace, how can Panthor not be implicated ?

>
> > When I came across this patch series:
> > -
> > https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@xxxxxxxxxxxx/#t
> > I found it could help abstract the interface between the secure heap and
> > the integration of protected memory in Panthor.
> >
> > A kernel driver would have to find the heap: `dma_heap_find()`, then
> > request allocation of a DMA buffer from it. The heap driver would deal
> > with the specifities of the protected memory on the system.
>
> Sure, but we still have to address *why* it would be a good idea for the
> driver to do it in the first place. The mediatek series had the same
> feedback.

Which got pretty clear replies iirc. The drivers needs scratch buffers and
secondary buffers to be protected, and these are not visible to userspace. No
one have made a proper counter argument yet. In MTK, the remote-proc driver for
the SCP (a multi-purpose multimedia co-processor) will also need to place the
firmware data into a protected buffer (with the help of the tee to copy the data
into it of course).

Nicolas

>
> Maxime

Next message: Yosry Ahmed: "Re: A path forward to cleaning up dying cgroups?"
Previous message: Aboorva Devarajan: "[PATCH 1/1] cpufreq: prevent NULL dereference in cpufreq_online"
In reply to: Maxime Ripard: "Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs"
Next in thread: Boris Brezillon: "Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]