Re: [PATCH v15 2/8] remoteproc: Add TEE support
From: Arnaud POULIQUEN
Date: Wed Mar 05 2025 - 07:50:44 EST
On 3/4/25 16:58, Bjorn Andersson wrote:
> On Wed, Feb 12, 2025 at 02:42:28PM +0100, Arnaud POULIQUEN wrote:
>> Hello,
>>
>> On 2/12/25 04:18, Bjorn Andersson wrote:
>>> On Tue, Dec 10, 2024 at 09:57:40AM +0100, Arnaud POULIQUEN wrote:
>>>> Hello Bjorn,
>>>>
>>>> On 12/6/24 23:07, Bjorn Andersson wrote:
>>>>> On Thu, Nov 28, 2024 at 09:42:09AM GMT, Arnaud Pouliquen wrote:
>>>>>> Add a remoteproc TEE (Trusted Execution Environment) driver
>>>>>> that will be probed by the TEE bus. If the associated Trusted
>>>>>> application is supported on secure part this driver offers a client
>>>>>> interface to load a firmware by the secure part.
>>>>>
>>>>> If...else?
>>>>>
>>>>>> This firmware could be authenticated by the secure trusted application.
>>>>>>
>>>>>
>>>>> I would like for this to fully describe how this fits with the bus and
>>>> Are you speaking about the OP-TEE bus?
>>>>
>>>> I assume that your attempt is that I provide more details on the live cycle
>>>> sequence, right?
>>>>
>>>
>>> Right, there's a tee_client_driver and there's a remoteproc driver.
>>> Let's document clearly how these interact.
>>>
>>>>> how it is expected to be used by a specific remoteproc driver.
>>>>>
>>>>>> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@xxxxxxxxxxx>
>>>>>> ---
>>>>>> Updates vs version v13:
>>>>>> - define REMOTEPROC_TEE as bool instead of tristate,
>>>>>> - remove the load of the firmware in rproc_tee_parse_fw as we will ensure
>>>>>> that the firmware is loaded using the load_fw() operation.
>>>>>> ---
>>>>>> drivers/remoteproc/Kconfig | 10 +
>>>>>> drivers/remoteproc/Makefile | 1 +
>>>>>> drivers/remoteproc/remoteproc_tee.c | 508 ++++++++++++++++++++++++++++
>>>>>> include/linux/remoteproc.h | 4 +
>>>>>> include/linux/remoteproc_tee.h | 105 ++++++
>>>>>> 5 files changed, 628 insertions(+)
>>>>>> create mode 100644 drivers/remoteproc/remoteproc_tee.c
>>>>>> create mode 100644 include/linux/remoteproc_tee.h
>>>>>>
>>>>>> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
>>>>>> index 955e4e38477e..f6335321d540 100644
>>>>>> --- a/drivers/remoteproc/Kconfig
>>>>>> +++ b/drivers/remoteproc/Kconfig
>>>>>> @@ -23,6 +23,16 @@ config REMOTEPROC_CDEV
>>>>>>
>>>>>> It's safe to say N if you don't want to use this interface.
>>>>>>
>>>>>> +config REMOTEPROC_TEE
>>>>>> + bool "Remoteproc support by a TEE application"
>>>>>> + depends on OPTEE
>>>>>> + help
>>>>>> + Support a remote processor with a TEE application.
>>>>>
>>>>> Does the remote processor run TEE applications? (Rethorical question...)
>>>>>
>>>>>> The Trusted
>>>>>> + Execution Context is responsible for loading the trusted firmware
>>>>>> + image and managing the remote processor's lifecycle.
>>>>>> +
>>>>>> + It's safe to say N if you don't want to use remoteproc TEE.
>>>>>
>>>>> It's not really about "wanting to use", it's a question whether your
>>>>> device implements/provides the remoteproc TEE.
>>>>>
>>>>>> +
>>>>>> config IMX_REMOTEPROC
>>>>>> tristate "i.MX remoteproc support"
>>>>>> depends on ARCH_MXC
>>>>>> diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
>>>>>> index 5ff4e2fee4ab..f77e0abe8349 100644
>>>>>> --- a/drivers/remoteproc/Makefile
>>>>>> +++ b/drivers/remoteproc/Makefile
>>>>>> @@ -11,6 +11,7 @@ remoteproc-y += remoteproc_sysfs.o
>>>>>> remoteproc-y += remoteproc_virtio.o
>>>>>> remoteproc-y += remoteproc_elf_loader.o
>>>>>> obj-$(CONFIG_REMOTEPROC_CDEV) += remoteproc_cdev.o
>>>>>> +obj-$(CONFIG_REMOTEPROC_TEE) += remoteproc_tee.o
>>>>>> obj-$(CONFIG_IMX_REMOTEPROC) += imx_rproc.o
>>>>>> obj-$(CONFIG_IMX_DSP_REMOTEPROC) += imx_dsp_rproc.o
>>>>>> obj-$(CONFIG_INGENIC_VPU_RPROC) += ingenic_rproc.o
>>>>>> diff --git a/drivers/remoteproc/remoteproc_tee.c b/drivers/remoteproc/remoteproc_tee.c
>>>>>> new file mode 100644
>>>>>> index 000000000000..3fe3f31068f2
>>>>>> --- /dev/null
>>>>>> +++ b/drivers/remoteproc/remoteproc_tee.c
>>>>>> @@ -0,0 +1,508 @@
>>>>>> +// SPDX-License-Identifier: GPL-2.0-or-later
>>>>>> +/*
>>>>>> + * Copyright (C) STMicroelectronics 2024
>>>>>> + * Author: Arnaud Pouliquen <arnaud.pouliquen@xxxxxxxxxxx>
>>>>>> + */
>>>>>> +
>>>>>> +#include <linux/firmware.h>
>>>>>> +#include <linux/io.h>
>>>>>> +#include <linux/module.h>
>>>>>> +#include <linux/remoteproc.h>
>>>>>> +#include <linux/remoteproc_tee.h>
>>>>>> +#include <linux/slab.h>
>>>>>> +#include <linux/tee_drv.h>
>>>>>> +
>>>>>> +#define MAX_TEE_PARAM_ARRAY_MEMBER 4
>>>>>> +
>>>>>> +/*
>>>>>> + * Authentication of the firmware and load in the remote processor memory
>>>>>
>>>>> Exactly what does this imply? Will the content of @memref be copied into
>>>>> some other memory?
>>>>
>>>> The objective is to authenticate and load in one step. So, yes, the image is
>>>> loaded into the remoteproc destination memory.
>>>>
>>>
>>> So, some separate device-memory, or some preallocated carveout which is
>>> only accessible from secure world?
>>
>> No sure to understand the difference you do between eparate device-memory and
>> preallocated carveout.
>>
>
> The main clarification I was looking for was that in your design you
> don't use any resources on the Linux side for when your remoteproc
> instance is running - i.e. no carveouts etc on the Linux side.
>
>> In OP-TEE, we use the same principle as in Linux. OP-TEE uses memory regions
>> declared in its device tree to list memories usable for the coprocessor (with
>> associated access rights). On load, it checks that the segments to load are
>> included in these memory regions.
>>
>> Linux only declares the shared memory-regions in the device tree, for IPC.
>>
>>>
>>> Does the OS need to retain @memref past this point?
>>
>> No need,and as the area contains the reult of request_firmware() that can be
>> corrupted by Linux, OP-TEE considered this as a temporaray unsafe memory. After
>> the load + authentication step this buffer is no more used.
>> For detail, OPTEE make a copy of the header and TLV (metadata) in a secure
>> memory. and load the firmware images in destination memories All these memories
>> are not accessible from the Linux.
>>
>
> No concerns with this, but these semantics should be clearly documented
> here.
>
>>>
>>>> On stm32mp1 we can not store the elf file in a temporary secure memory as
>>>> the memory is encrypted by software (this would take to much time).
>>>>
>>>> For your information, in OP-TEE, the application code is split into a generic
>>>> part and a platform adaptation layer. The generic application is mainly
>>>> responsible for:
>>>>
>>>> - Copying the binary header and metadata into secure memory and authenticating them.
>>>> - Parsing the ELF images and providing segments to load with associated
>>>> authenticated hashes to the platform application.
>>>> In the future, someone can add their own format if needed.
>>>>
>>>> But the generic part could be enhance to authenticate and load a non ELF binary.
>>>> So I'm trying to be generic as possible here.
>>>>
>>>
>>> Generic might be okay, but I'd prefer this to be less vague.
>>> Also worth noting is the Qualcomm implementation of TZ-backed
>>> remoteproc, which is already in the tree.
>>
>> Could you point me the code in Linux and your TEE, please?
>>
>
> One example is drivers/remoteproc/qcom_q6v5_pas.c where this is captured
> in adsp_start().
>
> qcom_mdt_pas_init() parses out the ELF header and signature information
> and passes this to the secure world, it then loads the segments of the
> ELF into the carvouts (qcom_mdt_load_no_init()) and finally it jumps to
> secure world with qcom_scm_pas_auth_and_reset(), which will lock down
> Linux's access to the carveouts, then based on previously established
> data will authenticate the loaded firmware and finally start execution
> on the remote processor.
>
> The difference in this model though is that we don't need the resource
> table for rproc_handle_resources() - so this doesn't meet your needs.
>
>>> There the firmware is loaded
>>> into carveouts, the certificates and hashes are validated.
>>
>> Seems to me that there is also a partial Authentication done during the load step.
>>
>
> Given that the ELF header and signature information is vetted before the
> actually copy the segments into carveouts, it's conceivable that the ELF
> header could be sanity checked...
>
>>> Lastly
>>> the operation "authenticate and start" is invoked, which does that, and
>>> locks the OS out of the given memory region - until "shutdown" is
>>> invoked.
>>
>> The differnce between the 2 implementations is the authentication method done in
>> 2 steps for Qualcomm implementation , in one step for OP-TEE.
>>
>
> Yes, but it needs to be pointed out that this is because you want the
> resource table to be authenticated.
>
>> So here if I just remove the term 'authentication' in the command description
>> does it ok for you?
>>
>
> No, perhaps I'm misinterpreting you here; but the goal isn't to play
> word games until it's good enough - the goal is to have a clean design
> that will cover the various cases, and for that we need to establish
> what your actual requirements on the host OS side is (typically while
> considering the "other side" to be a black box).
>
>>>
>>>>
>>>>>
>>>>>> + *
>>>>>> + * [in] params[0].value.a: unique 32bit identifier of the remote processor
>>>>>
>>>>> Why not just "remote processor identifier"?
>>>>>
>>>>>> + * [in] params[1].memref: buffer containing the image of the buffer
>>>>>> + */
>>>>>> +#define TA_RPROC_FW_CMD_LOAD_FW 1
>>>>>> +
>>>>>> +/*
>>>>>> + * Start the remote processor
>>>>>> + *
>>>>>> + * [in] params[0].value.a: unique 32bit identifier of the remote processor
>>>>>> + */
>>>>>> +#define TA_RPROC_FW_CMD_START_FW 2
>>>>>
>>>>> Why is there two "FW" in this constant? Why isn't it just
>>>>> "TA_RPROC_FW_CMD_START"?
>>>>>
>>>>> And why is it not TEE_PROC_FW_CMD_START?
>>>>>
>>>>>> +
>>>>>> +/*
>>>>>> + * Stop the remote processor
>>>>>> + *
>>>>>> + * [in] params[0].value.a: unique 32bit identifier of the remote processor
>>>>>> + */
>>>>>> +#define TA_RPROC_FW_CMD_STOP_FW 3
>>>>>> +
>>>>>> +/*
>>>>>> + * Return the address of the resource table, or 0 if not found
>>>>>> + * No check is done to verify that the address returned is accessible by
>>>>>> + * the non secure context. If the resource table is loaded in a protected
>>>>>> + * memory the access by the non secure context will lead to a data abort.
>>>>>
>>>>> These three lines describe a scenario that doesn't make any sense to me.
>>>>> But if that's the case, you should be able to describe that the API
>>>>> might give you a inaccessible pointer, by design.
>>>>
>>>> On STM32MP, we have a kind of firewall in OP-TEE that sets memory access rights
>>>> from the device tree. So if the firmware image is not linked according to the
>>>> firewall configuration, the pointer may not be accessible.
>>>>
>>>> I will update the comment as you propose.
>>>>
>>>>>
>>>>>> + *
>>>>>> + * [in] params[0].value.a: unique 32bit identifier of the remote processor
>>>>>> + * [out] params[1].value.a: 32bit LSB resource table memory address
>>>>>> + * [out] params[1].value.b: 32bit MSB resource table memory address
>>>>>> + * [out] params[2].value.a: 32bit LSB resource table memory size
>>>>>> + * [out] params[2].value.b: 32bit MSB resource table memory size
>>>>>> + */
>>>>>> +#define TA_RPROC_FW_CMD_GET_RSC_TABLE 4
>>>>>> +
>>>>>> +/*
>>>>>> + * Return the address of the core dump
>>>>>
>>>>> What does this mean? What will I find at @memref after this call?
>>>>
>>>> I do not have a simple answer here as it depends on the OP-TEE strategy.
>>>> It could be an obscure core dump with possible encryption.
>>>>
>>>> I will remove this as it is not yet implemented in OP-TEE.
>>>>
>>>
>>> Okay. But I would prefer that we define the semantics before it's
>>> implemented...
>>
>> that seems fair, I notice that we will have to address this in a separate thread
>> strating with a series in Linux.
>>
>>
>>>
>>>> https://elixir.bootlin.com/op-tee/4.4.0/source/ta/remoteproc/src/remoteproc_core.c#L1131
>>>>
>>>>>
>>>>>> + *
>>>>>> + * [in] params[0].value.a: unique 32bit identifier of the remote processor
>>>>>> + * [out] params[1].memref: address of the core dump image if exist,
>>>>>> + * else return Null
>>>>>
>>>>> s/else return Null/or NULL/
>>>>>
>>>>>> + */
>>>>>> +#define TA_RPROC_FW_CMD_GET_COREDUMP 5
>>>>>> +
>>>>>> +/*
>>>>>> + * Release remote processor firmware images and associated resources.
>>>>>
>>>>> Exactly what does this mean for the caller?
>>>>
>>>> It is platform dependent. It can consist in cleaning the memory, but
>>>> can be also something else such as firewall configuration.
>>>> On stm323mp we clean all the memories region reserved for the remote processor.
>>>>
>>>
>>> We can't have an ABI which isn't well defined in intent. Your examples
>>> would easily fall in the realm of a well defined interface, but this
>>> ties into the question above - what does is actually mean in terms of
>>> the memory carveouts and such.
>>>
>>
>> Regarding this comment and the one below, does following description would
>> respond to your expectations? else do you have a suggestion?
>>
>> /*
>> * This command should be used in case an error occurs between the loading of
>> * the firmware images (TA_RPROC_CMD_LOAD_FW) and the starting of the remote
>> * processor (TA_RPROC_CMD_START_FW),
>
> This is valuable information related to TA_RPROC_CMD_LOAD_FW and
> TA_RPROC_CMD_START_FW, so let's document this there instead.
>
>> * or after stopping the remote processor
>> * (TA_RPROC_CMD_STOP_FW).
>> *
>> * This command is used to inform the TEE (Trusted Execution Environment) that
>> * resources associated with the remote processor can be released. After this
>> * command, the firmware is no longer present in the remote processor's memories
>> * and must be reloaded (TA_RPROC_FW_CMD_LOAD_FW) to restart the remote
>> * processor.
>
> I guess it's fine to define it like this on this level, but in the
> remoteproc core I'd like us to express the related logic as "release
> resources allocated durign rproc_parse_fw(). (And I don't think these
> two interpretations are in conflict).
>
> What's unexpected to me then is that you're not actually reloading your
> firmware across a recovery/restart?
I do it. in rproc_boot_recovery()
- we call on rproc_stop()
rproc_reset_rsc_table_on_stop() copy the resource table in
rproc->cached_table
- we call rproc_load_fw() added in patch 3/8
- we call rproc_start() which overwrite the resource table with values in
proc->cached_table
The proc->cached_table avoid to release and reallocate all the carveout on recovery.
This management is one of the points that complexity the sequence in the
remoteproc_tee case.
Thanks,
Arnaud
>
> Regards,
> Bjorn