RE: RE: [EXT] Re: [PATCH 4/4] firmware: imx: add driver for NXP EdgeLock Enclave

From: Pankaj Gupta
Date: Tue May 21 2024 - 07:57:35 EST




> -----Original Message-----
> From: Marc Kleine-Budde <mkl@xxxxxxxxxxxxxx>
> Sent: Monday, May 20, 2024 4:32 PM
> To: Pankaj Gupta <pankaj.gupta@xxxxxxx>
> Cc: Jonathan Corbet <corbet@xxxxxxx>; Rob Herring <robh+dt@xxxxxxxxxx>;
> Krzysztof Kozlowski <krzysztof.kozlowski+dt@xxxxxxxxxx>; Conor Dooley
> <conor+dt@xxxxxxxxxx>; Shawn Guo <shawnguo@xxxxxxxxxx>; Sascha Hauer
> <s.hauer@xxxxxxxxxxxxxx>; Pengutronix Kernel Team
> <kernel@xxxxxxxxxxxxxx>; Fabio Estevam <festevam@xxxxxxxxx>; linux-
> doc@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> devicetree@xxxxxxxxxxxxxxx; imx@xxxxxxxxxxxxxxx; linux-arm-
> kernel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: RE: [EXT] Re: [PATCH 4/4] firmware: imx: add driver for NXP
> EdgeLock Enclave
>
> On 17.05.2024 11:24:46, Pankaj Gupta wrote:
> > > > new file mode 100644
> > > > index 000000000000..0463f26d93c7
> > > > --- /dev/null
> > > > +++ b/drivers/firmware/imx/ele_base_msg.c
> > > > @@ -0,0 +1,287 @@
> > > > +// SPDX-License-Identifier: GPL-2.0+
> > > > +/*
> > > > + * Copyright 2024 NXP
> > > > + */
> > > > +
> > > > +#include <linux/types.h>
> > > > +#include <linux/completion.h>
> > > > +#include <linux/dma-mapping.h>
> > > > +
> > > > +#include "ele_base_msg.h"
> > > > +#include "ele_common.h"
> > > > +
> > > > +int ele_get_info(struct device *dev, struct soc_info *s_info) {
> > > > + struct se_if_priv *priv = dev_get_drvdata(dev);
> > > > + struct se_api_msg *tx_msg __free(kfree);
> > > > + struct se_api_msg *rx_msg __free(kfree);
> > > > + phys_addr_t get_info_addr;
> > > > + u32 *get_info_data;
> > > > + u32 status;
> > > > + int ret;
> > > > +
> > > > + if (!priv || !s_info)
> > > > + goto exit;
> > >
> > > You should code properly, so that this doesn't happen, your cleanup
> > > is broken, it will work on uninitialized data, as Sascha already mentioned.
> >
> > The API(s) part of this file will be later exported and might get used by
> driver/crypto/ele/*.c.
> > Still if you think, this check should be removed, I will do it in v2.
>
> It makes no sense to call these functions with NULL pointers, if you do so, it's
> a mistake by the caller. If it's used by some other part of the ele driver that
> should be coded properly.
>
Will remove this change in v2.

> > > > +
> > > > + memset(s_info, 0x0, sizeof(*s_info));
> > > > +
> > > > + if (priv->mem_pool_name)
> > > > + get_info_data = get_phy_buf_mem_pool(dev,
> > > > + priv->mem_pool_name,
> > > > + &get_info_addr,
> > > > + ELE_GET_INFO_BUFF_SZ);
> > > > + else
> > > > + get_info_data = dmam_alloc_coherent(dev,
> > > > + ELE_GET_INFO_BUFF_SZ,
> > > > + &get_info_addr,
> > > > + GFP_KERNEL);
> > >
> > > It's better style to move the init of the dma memory into the probe
> > > function.
> >
> > It is not DMA init. It is DMA allocation.
>
> It's better style to move the allocation of the dma memory into the probe
> function.
>
The buffer 'get_info_data', is allocated and freed within this function.
This API is called multiple times:
- as part of probe.
- as part of suspend/resume.

Why to keep the memory retained?

>
> [...]
>
> > > > + priv->rx_msg = rx_msg;
> > > > + ret = imx_ele_msg_send_rcv(priv, tx_msg);
> > >
> > > This API looks strange, why put the tx_msg as a parameter the rx_msg
> > > into the private struct?
> >
> > The rx_msg is the populated in the interrupt context. Hence, it kept
> > as part of private structure; which is in-turn associated with
> > mbox_client.
>
> These are implementation details, it just feels strange to pass one parameter
> via an arguments and put the other in the private pointer.
>
> > Though, in v2 moving the rx_msg setting to imx_ele_msg_send_rcv(priv,
> > tx_msg, rx_msg);
>
> fine
>
> [...]
>
> > > > + if (status != priv->success_tag) {
> > > > + dev_err(dev, "Command Id[%d], Response Failure = 0x%x",
> > > > + ELE_GET_INFO_REQ, status);
> > > > + ret = -1;
> > > > + }
> > > > +
> > > > + s_info->imem_state = (get_info_data[ELE_IMEM_STATE_WORD]
> > > > + & ELE_IMEM_STATE_MASK) >> 16;
> > >
> > > can you use a struct for get_info_data and use FIELD_GET() (if
> > > needed)
> >
> > Re-write the structure soc_info, matching the information provided in
> > response to this api.
>
> Looks better. Please compile the driver and check with "pahole" that the
> layout of these structures doesn't contain any unwanted padding.
> Otherwise add "__packed" and if you can guarantee "__aligned(4)".
>

Structure is copied from the already deployed user-space library.
Validated that each variable is printing correct value.
Even below code changes are done:
- tx_msg->data[2] = ELE_GET_INFO_READ_SZ;
+ tx_msg->data[2] = sizeof(struct soc_info);

> > struct dev_info {
> > uint8_t cmd;
> > uint8_t ver;
> > uint16_t length;
> > uint16_t soc_id;
> > uint16_t soc_rev;
> > uint16_t lmda_val;
> > uint8_t ssm_state;
> > uint8_t dev_atts_api_ver;
> > uint8_t uid[MAX_UID_SIZE];
> > uint8_t sha_rom_patch[DEV_GETINFO_ROM_PATCH_SHA_SZ];
> > uint8_t sha_fw[DEV_GETINFO_FW_SHA_SZ]; };
> >
> > struct dev_addn_info {
> > uint8_t oem_srkh[DEV_GETINFO_OEM_SRKH_SZ];
> > uint8_t trng_state;
> > uint8_t csal_state;
> > uint8_t imem_state;
> > uint8_t reserved2;
> > };
> >
> > struct soc_info {
> > struct dev_info d_info;
> > struct dev_addn_info d_addn_info; };
>
> [...]
>
> > > > +int imx_ele_msg_send(struct se_if_priv *priv, void *mssg) {
> > > > + bool is_cmd_lock_tobe_taken = false;
> > > > + int err;
> > > > +
> > > > + if (!priv->waiting_rsp_dev || priv->no_dev_ctx_used) {
> > > > + is_cmd_lock_tobe_taken = true;
> > > > + mutex_lock(&priv->se_if_cmd_lock);
> > > > + }
> > > > + scoped_guard(mutex, &priv->se_if_lock);
> > > > +
> > > > + err = mbox_send_message(priv->tx_chan, mssg);
> > > > + if (err < 0) {
> > > > + dev_err(priv->dev, "Error: mbox_send_message failure.\n");
> > > > + if (is_cmd_lock_tobe_taken)
> > > > + mutex_unlock(&priv->se_if_cmd_lock);
> > >
> > > Only dropping the lock in case of failure doesn't look right to me.
> >
> > The callers of this function, takes the execution flow to aborting the
> > operation on getting return code < 0. No next action is expected under
> > this aborted operation. Unlocking the lock here is not an issue
> >
> > > It seems you should better move the lock to the callers of this function.
> >
> > Accepted, and moved to the caller of the function for:
> > - locking
> > - unlocking in case of error.
> >
> > Unlocking in the read API, once response is successfully received and
> > read.
>
> A better design would be: imx_ele_msg_rcv() imx_ele_msg_send() are
> expected to be called locked. Add lockdep_assert_held() to these function to
> document/check this.
>
> The callers of imx_ele_msg_rcv() and imx_ele_msg_send() have to take care
> of the locking.
>
> [...]
>
The locking/unlocking of se_if_cmd_lock, is taken care by the callers only:
- imx_ele_msg_send_rcv calls both the functions:
--imx_ele_msg_send.
--imx_ele_msg_rcv.

But the lockdep_assert_held, cannot be added to imx_ele_msg_send, as its another caller function imx_ele_miscdev_msg_send calls if for sending:
--- command (here command lock is taken).
--- response to a command (here command lock is not taken).

Will add lockdep_assert_held, to receive path, in v2.

> > > > +static const struct imx_se_node_info_list imx8ulp_info = {
> > > > + .num_mu = 1,
> > > > + .soc_id = SOC_ID_OF_IMX8ULP,
> > > > + .info = {
> > > > + {
> > > > + .se_if_id = 2,
> > > > + .se_if_did = 7,
> > > > + .max_dev_ctx = 4,
> > > > + .cmd_tag = 0x17,
> > > > + .rsp_tag = 0xe1,
> > > > + .success_tag = 0xd6,
> > > > + .base_api_ver = MESSAGING_VERSION_6,
> > > > + .fw_api_ver = MESSAGING_VERSION_7,
> > > > + .se_name = "hsm1",
> > > > + .mbox_tx_name = "tx",
> > > > + .mbox_rx_name = "rx",
> > > > + .pool_name = "sram",
> > > > + .fw_name_in_rfs = IMX_ELE_FW_DIR\
> > > ^
> > > not
> > > needed
> >
> > It is needed for i.MX8ULP, dual FW support.
>
> The backslash is not needed.
Accepted. Will correct in v2.

>
> >
> > > > + "mx8ulpa2ext-ahab-
> container.img",
> >
> >
> > > > + .soc_register = true,
> > > > + .reserved_dma_ranges = true,
> > > > + .imem_mgmt = true,
> > > > + },
> > > > + },
> > > > +};
> > > > +
> > > > +static const struct imx_se_node_info_list imx93_info = {
> > > > + .num_mu = 1,
> > > > + .soc_id = SOC_ID_OF_IMX93,
> > > > + .info = {
> > > > + {
> > > > + .se_if_id = 2,
> > > > + .se_if_did = 3,
> > > > + .max_dev_ctx = 4,
> > > > + .cmd_tag = 0x17,
> > > > + .rsp_tag = 0xe1,
> > > > + .success_tag = 0xd6,
> > > > + .base_api_ver = MESSAGING_VERSION_6,
> > > > + .fw_api_ver = MESSAGING_VERSION_7,
> > > > + .se_name = "hsm1",
> > > > + .mbox_tx_name = "tx",
> > > > + .mbox_rx_name = "rx",
> > > > + .reserved_dma_ranges = true,
> > > > + .imem_mgmt = true,
> > > > + .soc_register = true,
> > > > + },
> > > > + },
> > >
> > >
> > > Some (most?) members of these structs are the same. Why do you have
> > > this abstraction if it's not needed right now?
> >
> > It is needed as the values is different for different NXP SoC
> > compatible. It will be needed for NXP i.MX95 platform, whose code will
> > be next in pipeline.
>
> How does the imx95 .info look like?
>
Copied from the internal repo.
static const struct imx_info_list imx95_info = {
.num_mu = 4,
.soc_id = SOC_ID_OF_IMX95,
.info = {
{
.socdev = false,
.mu_id = 2,
.mu_did = 3,
.max_dev_ctx = 4,
.cmd_tag = 0x17,
.rsp_tag = 0xe1,
.success_tag = 0xd6,
.base_api_ver = MESSAGING_VERSION_6,
.fw_api_ver = MESSAGING_VERSION_7,
.se_name = "hsm1",
.mbox_tx_name = "tx",
.mbox_rx_name = "rx",
.pool_name = NULL,
.reserved_dma_ranges = false,
.init_fw = true,
.v2x_state_check = true,
.start_rng = ele_start_rng,
.enable_ele_trng = true,
.imem_mgmt = false,
.mu_buff_size = 0,
.fw_name_in_rfs = NULL,
},
{
.socdev = false,
.mu_id = 0,
.mu_did = 0,
.max_dev_ctx = 0,
.cmd_tag = 0x17,
.rsp_tag = 0xe1,
.success_tag = 0xd6,
.base_api_ver = 0x2,
.fw_api_ver = 0x2,
.se_name = "v2x_dbg",
.pool_name = NULL,
.mbox_tx_name = "tx",
.mbox_rx_name = "rx",
.reserved_dma_ranges = false,
.init_fw = false,
.v2x_state_check = true,
.start_rng = v2x_start_rng,
.enable_ele_trng = false,
.imem_mgmt = false,
.mu_buff_size = 0,
.fw_name_in_rfs = NULL,
},
{
.socdev = false,
.mu_id = 4,
.mu_did = 0,
.max_dev_ctx = 4,
.cmd_tag = 0x18,
.rsp_tag = 0xe2,
.success_tag = 0xd6,
.base_api_ver = 0x2,
.fw_api_ver = 0x2,
.se_name = "v2x_sv0",
.pool_name = NULL,
.mbox_tx_name = "tx",
.mbox_rx_name = "rx",
.reserved_dma_ranges = false,
.init_fw = false,
.v2x_state_check = true,
.start_rng = NULL,
.enable_ele_trng = false,
.imem_mgmt = false,
.mu_buff_size = 16,
.fw_name_in_rfs = NULL,
},
{
.socdev = false,
.mu_id = 6,
.mu_did = 0,
.max_dev_ctx = 4,
.cmd_tag = 0x1a,
.rsp_tag = 0xe4,
.success_tag = 0xd6,
.base_api_ver = 0x2,
.fw_api_ver = 0x2,
.se_name = "v2x_she",
.pool_name = NULL,
.mbox_tx_name = "tx",
.mbox_rx_name = "rx",
.reserved_dma_ranges = false,
.init_fw = false,
.v2x_state_check = true,
.start_rng = NULL,
.enable_ele_trng = false,
.imem_mgmt = false,
.mu_buff_size = 16,
.fw_name_in_rfs = NULL,
},
{
.socdev = false,
.mu_id = 6,
.mu_did = 0,
.max_dev_ctx = 4,
.cmd_tag = 0x1a,
.rsp_tag = 0xe4,
.success_tag = 0xd6,
.base_api_ver = 0x2,
.fw_api_ver = 0x2,
.se_name = "v2x_she",
.pool_name = NULL,
.mbox_tx_name = "tx",
.mbox_rx_name = "rx",
.reserved_dma_ranges = false,
.init_fw = false,
.v2x_state_check = true,
.start_rng = NULL,
.enable_ele_trng = false,
.imem_mgmt = false,
.mu_buff_size = 256,
.fw_name_in_rfs = NULL,
},
}
};
>
> > > > +static int imx_fetch_soc_info(struct device *dev) {
> > > > + struct se_if_priv *priv = dev_get_drvdata(dev);
> > > > + struct imx_se_node_info_list *info_list;
> > > > + const struct imx_se_node_info *info;
> > > > + struct soc_device_attribute *attr;
> > > > + struct soc_device *sdev;
> > > > + struct soc_info s_info;
> > > > + int err = 0;
> > > > +
> > > > + info = priv->info;
> > > > + info_list = (struct imx_se_node_info_list *)
> > > > + device_get_match_data(dev->parent);
> > >
> > > I think cast is not needed.
> >
> > It returns memory reference with const attribute. SoC revision member
> > of 'info_list', is required to be updated. Thus type casted.
>
> Have you considered that this memory is marked as const for a reason?
> It's const, you cannot change it. Place any values that have to changed into
> your priv.

Created a static variable g_soc_rev in the se_ctrl.c.
Accepted and will correct it in v2.

>
> > > > + if (info_list->soc_rev)
> > > > + return err;
> > >
Will change the above condition to g_soc_rev.

> > > What does this check do? You'll only get data you put in the
> > > info_list in the first place.
>
> > info_list->soc_rev, is equal to zero for the first call to this
> > function. To return from this function if this function is already
> > executed.
>
> This looks wrong, see above.

Accepted and will correct it in v2.

>
> > > > + err = ele_get_info(dev, &s_info);
> > > > + if (err)
> > > > + s_info.major_ver = DEFAULT_IMX_SOC_VER;
> > >
> > > Why continue here in case of error?
> >
> > To continue with SoC registration for the default values (without
> > fetching information from ELE).
>
> Have you tested the driver that it will work, if this fails?
Tested in unit testing by making err equal to non-zero.
Showing soc revision and serial number are shown as zeros.

But, I agree with you to return failure. As there is no point continuing if the SE probe failed.
Earlier I was thinking to allow other modules depending on soc registration info, can work.

Accepted and will not continue in case of failure in V2.

>
> > > > +
> > > > + info_list->soc_rev = s_info.soc_rev;
> > > > +
> > > > + if (!info->soc_register)
> > > > + return 0;
> > > > +
> > > > + attr = devm_kzalloc(dev, sizeof(*attr), GFP_KERNEL);
> > > > + if (!attr)
> > > > + return -ENOMEM;
> > > > +
> > > > + if (s_info.minor_ver)
> > > > + attr->revision = devm_kasprintf(dev, GFP_KERNEL, "%x.%x",
> > > > + s_info.major_ver,
> > > > + s_info.minor_ver);
> > > > + else
> > > > + attr->revision = devm_kasprintf(dev, GFP_KERNEL, "%x",
> > > > + s_info.major_ver);
> > > > +
> > > > + switch (s_info.soc_id) {
> > > > + case SOC_ID_OF_IMX8ULP:
> > > > + attr->soc_id = devm_kasprintf(dev, GFP_KERNEL,
> > > > + "i.MX8ULP");
> > > > + break;
> > > > + case SOC_ID_OF_IMX93:
> > > > + attr->soc_id = devm_kasprintf(dev, GFP_KERNEL,
> > > > + "i.MX93");
> > > > + break;
> > > > + }
> > > > +
> > > > + err = of_property_read_string(of_root, "model",
> > > > + &attr->machine);
> > > > + if (err) {
> > > > + devm_kfree(dev, attr);
> > >
> > > Why do you do a manual cleanup of devm managed resources? Same
> > > applies to the other devm managed resources, too.
> > >
> > Used devm managed memory, as this function is called as part probe.
> > Post device registration, this devm managed memory is un-necessarily
> > blocked. It is better to release it as part of clean-up, under this
> > function only.
>
> Why do you allocate the memory with devm in the first place, if it's not
> needed after probe?

Sorry to confuse you. Actually the devm_memory will be needed for the case of soc_registration.
Meaning, memory with devm, will be needed post probing as well.

If this function fails, the probing will fail too. It will be auto cleaned.

Accepted, will remove the devm_free in v2.

>
> > Other devm managed memory clean-up, under se_probe_cleanup, will be
> > removed, as suggested.
>
> regards,
> Marc
>
> --
> Pengutronix e.K. | Marc Kleine-Budde |
> Embedded Linux | https://www.pengutronix.de |
> Vertretung Nürnberg | Phone: +49-5121-206917-129 |
> Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-9 |