Re: [PATCH v10 3/4] tee: optee: support tracking system threads

From: Etienne CARRIERE - foss
Date: Fri Oct 13 2023 - 04:39:53 EST


> From: Sumit Garg <sumit.garg@xxxxxxxxxx>
> Sent: Friday, October 13, 2023 9:21 AM
>
> On Wed, 11 Oct 2023 at 12:41, Etienne CARRIERE - foss
> <etienne.carriere@xxxxxxxxxxx> wrote:
> >
> > > From: Sumit Garg <sumit.garg@xxxxxxxxxx>
> > > Sent: Friday, October 6, 2023 11:33 AM
> > >
> > > On Tue, 3 Oct 2023 at 19:36, Etienne Carriere
> > > <etienne.carriere@xxxxxxxxxxx> wrote:
> > > >
> > > > Adds support in the OP-TEE driver to keep track of reserved system
> > > > threads. The logic allows one OP-TEE thread to be reserved to TEE system
> > > > sessions.
> > > >
> > > > The optee_cq_*() functions are updated to handle this if enabled,
> > > > that is when TEE describes how many thread context it supports
> > > > and when at least 1 session has registered as a system session
> > > > (using tee_client_system_session()).
> > > >
> > > > For sake of simplicity, initialization of call queue management
> > > > is factorized into new helper function optee_cq_init().
> > > >
> > > > The SMC ABI part of the driver enables this tracking, but the
> > > > FF-A ABI part does not.
> > > >
> > > >
> > > > Co-developed-by: Jens Wiklander <jens.wiklander@xxxxxxxxxx>
> > > > Signed-off-by: Jens Wiklander <jens.wiklander@xxxxxxxxxx>
> > > > Co-developed-by: Sumit Garg <sumit.garg@xxxxxxxxxx>
> > > > Signed-off-by: Sumit Garg <sumit.garg@xxxxxxxxxx>
> > > > Signed-off-by: Etienne Carriere <etienne.carriere@xxxxxxxxxxx>
> > > > ---
> > > > Changes since v9:
> > > > - Add a reference counter for TEE system thread provisioning. We reserve
> > > > a TEE thread context for system session only when there is at least
> > > > 1 opened system session.
> > > > - Use 2 wait queue lists, normal_waiters and sys_waiter, as proposed in
> > > > patch v8. Using a single list can prevent a waiting system thread from
> > > > being resumed if the executing system thread wakes a normal waiter in
> > > > the list.
> > >
> > > How would that be possible? The system thread wakeup
> > > (free_thread_threshold = 0) is given priority over normal thread
> > > wakeup (free_thread_threshold = 1). I think a single queue list would
> > > be sufficient as demonstrated in v9.
> > >
> >
> > Hello Sumit,
> >
> > I think a system session can be trapped waiting when using a single queue list.
> > To have a chance to reach the TEE, a waiting thread must wait that a TEE thread comes out of the TEE and calls complete() on the waitqueue to wake next waiter.
> >
> > To illustrate, consider a 10 TEE threads configuration on TEE side (::total_thread_count=10 at init),
> > and several TEE clients in Linux OS, including 2 system sessions, from 2 consumer drivers (::sys_thread_req_count=2).
> >
> > Imagine the 9 normal threads and the 1 system thread are in use. (::free_thread_count=0),
> > Now comes the other system session: it goes to the waitqueue list.
> > Now comes a normal session invocation: it goes to the waitqueue list, 1st position.
> >
> > Now, TEE system thread returns to Linux:
> > It increments the counter, ::free_thread_count=1, and calls complete() for the waitequeue.
> > The 1st element in the waitqueue list is the last entered normal session invocation.
> > However, that waiter won't switch local boolean 'need_wait' to false because ::free_thread_count=1 and ::sys_thread_req_count!=0.
> > So no attempt to reach TEE and wake another waiter on return.
> > At that point there is a system session in the waitqueue list that could enter TEE (::free_thread_count=1) but is waiting someone returns from the TEE.
>
> I suppose the following loop tries to wake-up every waiter to give
> them a chance to enter OP-TEE. So with that system session would
> always be prefered over normal session, right?

No, the below loop will wake only the 1st waiter it finds in the list that is
current waiting (completion_done() returns false). So if it finds a normal
session, it will only wake this one which, in turn, will not try to reach the
TEE from the while(need_wiat) loop in optee_cq_wait_init(), because there is
not enough free threads. Because it does not reach the TEE, it will not
it wake another waiter.

>
> static void optee_cq_complete_one(struct optee_call_queue *cq)
> {
> struct optee_call_waiter *w;
>
> list_for_each_entry(w, &cq->waiters, list_node) {
> if (!completion_done(&w->c)) {
> complete(&w->c);
> break;
> }
> }
> }
>
> -Sumit
>

Note I've found a error in this patch v10, see below.

BR,
Etienne


> >
> > With 2 lists, we first treat system sessions to overcome that.
> > Am I missing something?
> >
> > Best regards,
> > Etienne
> >
> > > -Sumit
> > >
> > > > - Updated my e-mail address.
> > > > - Rephrased a bit the commit message.
> > > >
> > > > Changes since patch v8
> > > > - Patch v9 (reference below) attempted to simplify the implementation
> > > > https://lore.kernel.org/lkml/20230517143311.585080-1-sumit.garg@xxxxxxxxxx/#t
> > > >
> > > > Changes since v7:
> > > > - Changes the logic to reserve at most 1 call entry for system sessions
> > > > as per patches v6 and v7 discussion threads (the 2 below bullets)
> > > > and updates commit message accordingly.
> > > > - Field optee_call_queue::res_sys_thread_count is replaced with 2 fields:
> > > > sys_thread_req_count and boolean sys_thread_in_use.
> > > > - Field optee_call_waiter::sys_thread is replaced with 2 fields:
> > > > sys_thread_req and sys_thread_used.
> > > > - Adds inline description comments for struct optee_call_queue and
> > > > struct optee_call_waiter.
> > > >
> > > > Changes since v6:
> > > > - Moved out changes related to adding boolean system thread attribute
> > > > into optee driver call queue and SMC/FF-A ABIs API functions. These
> > > > changes were squashed into patch 1/4 of this patch v7 series.
> > > > - Comment about adding a specific commit for call queue refactoring
> > > > was not addressed such a patch would only introduce function
> > > > optee_cq_init() with very little content in (mutex & list init).
> > > > - Added Co-developed-by tag for Jens contribution as he's not responsible
> > > > for the changes I made in this patch v7.
> > > >
> > > > No change since v5
> > > >
> > > > Changes since v4:
> > > > - New change that supersedes implementation proposed in PATCH v4
> > > > (tee: system invocation"). Thanks to Jens implementation we don't need
> > > > the new OP-TEE services that my previous patch versions introduced to
> > > > monitor system threads entry. Now, Linux optee SMC ABI driver gets TEE
> > > > provisioned thread contexts count once and monitors thread entries in
> > > > OP-TEE on that basis and the system thread capability of the related
> > > > tee session. By the way, I dropped the WARN_ONCE() call I suggested
> > > > on tee thread exhaustion as it does not provides useful information.
> > > > ---
> > > > drivers/tee/optee/call.c | 128 ++++++++++++++++++++++++++++--
> > > > drivers/tee/optee/ffa_abi.c | 3 +-
> > > > drivers/tee/optee/optee_private.h | 24 +++++-
> > > > drivers/tee/optee/smc_abi.c | 16 +++-
> > > > 4 files changed, 159 insertions(+), 12 deletions(-)
> > > >
> > > > diff --git a/drivers/tee/optee/call.c b/drivers/tee/optee/call.c
> > > > index 152ae9bb1785..38543538d77b 100644
> > > > --- a/drivers/tee/optee/call.c
> > > > +++ b/drivers/tee/optee/call.c
> > > > @@ -39,9 +39,31 @@ struct optee_shm_arg_entry {
> > > > DECLARE_BITMAP(map, MAX_ARG_COUNT_PER_ENTRY);
> > > > };
> > > >
> > > > +void optee_cq_init(struct optee_call_queue *cq, int thread_count)
> > > > +{
> > > > + mutex_init(&cq->mutex);
> > > > + INIT_LIST_HEAD(&cq->sys_waiters);
> > > > + INIT_LIST_HEAD(&cq->normal_waiters);
> > > > +
> > > > + /*
> > > > + * If cq->total_thread_count is 0 then we're not trying to keep
> > > > + * track of how many free threads we have, instead we're relying on
> > > > + * the secure world to tell us when we're out of thread and have to
> > > > + * wait for another thread to become available.
> > > > + */
> > > > + cq->total_thread_count = thread_count;
> > > > + cq->free_thread_count = thread_count;
> > > > +}
> > > > +
> > > > void optee_cq_wait_init(struct optee_call_queue *cq,
> > > > struct optee_call_waiter *w, bool sys_thread)
> > > > {
> > > > + unsigned int free_thread_threshold;
> > > > + bool need_wait = false;
> > > > +
> > > > + memset(w, 0, sizeof(*w));
> > > > + w->sys_thread = sys_thread;
> > > > +
> > > > /*
> > > > * We're preparing to make a call to secure world. In case we can't
> > > > * allocate a thread in secure world we'll end up waiting in
> > > > @@ -53,15 +75,47 @@ void optee_cq_wait_init(struct optee_call_queue *cq,
> > > > mutex_lock(&cq->mutex);
> > > >
> > > > /*
> > > > - * We add ourselves to the queue, but we don't wait. This
> > > > - * guarantees that we don't lose a completion if secure world
> > > > - * returns busy and another thread just exited and try to complete
> > > > - * someone.
> > > > + * We add ourselves to a queue, but we don't wait. This guarantees
> > > > + * that we don't lose a completion if secure world returns busy and
> > > > + * another thread just exited and try to complete someone.
> > > > */
> > > > init_completion(&w->c);
> > > > - list_add_tail(&w->list_node, &cq->waiters);
> > > > +
> > > > + if (sys_thread)
> > > > + list_add_tail(&w->list_node, &cq->sys_waiters);
> > > > + else
> > > > + list_add_tail(&w->list_node, &cq->normal_waiters);
> > > > +
> > > > + if (cq->total_thread_count) {
> > > > + if (sys_thread || !cq->sys_thread_req_count)
> > > > + free_thread_threshold = 0;
> > > > + else
> > > > + free_thread_threshold = 1;
> > > > +
> > > > + if (cq->free_thread_count > free_thread_threshold)
> > > > + cq->free_thread_count--;
> > > > + else
> > > > + need_wait = true;
> > > > + }
> > > >
> > > > mutex_unlock(&cq->mutex);
> > > > +
> > > > + while (need_wait) {
> > > > + optee_cq_wait_for_completion(cq, w);
> > > > + mutex_lock(&cq->mutex);
> > > > +
> > > > + if (sys_thread || !cq->sys_thread_req_count)
> > > > + free_thread_threshold = 0;
> > > > + else
> > > > + free_thread_threshold = 1;
> > > > +
> > > > + if (cq->free_thread_count > free_thread_threshold) {
> > > > + cq->free_thread_count--;
> > > > + need_wait = false;
> > > > + }
> > > > +
> > > > + mutex_unlock(&cq->mutex);
> > > > + }
> > > > }
> > > >
> > > > void optee_cq_wait_for_completion(struct optee_call_queue *cq,
> > > > @@ -74,7 +128,11 @@ void optee_cq_wait_for_completion(struct optee_call_queue *cq,
> > > > /* Move to end of list to get out of the way for other waiters */
> > > > list_del(&w->list_node);
> > > > reinit_completion(&w->c);
> > > > - list_add_tail(&w->list_node, &cq->waiters);
> > > > +
> > > > + if (w->sys_thread)
> > > > + list_add_tail(&w->list_node, &cq->sys_waiters);
> > > > + else
> > > > + list_add_tail(&w->list_node, &cq->normal_waiters);
> > > >
> > > > mutex_unlock(&cq->mutex);
> > > > }
> > > > @@ -83,7 +141,15 @@ static void optee_cq_complete_one(struct optee_call_queue *cq)
> > > > {
> > > > struct optee_call_waiter *w;
> > > >
> > > > - list_for_each_entry(w, &cq->waiters, list_node) {
> > > > + /* Wake waiting system session first */
> > > > + list_for_each_entry(w, &cq->sys_waiters, list_node) {
> > > > + if (!completion_done(&w->c)) {
> > > > + complete(&w->c);
> > > > + break;

I see a bug here. The loop should return straight (s/break/return/)
without attempting to wake a normal waiter.

> > > > + }
> > > > + }
> > > > +
> > > > + list_for_each_entry(w, &cq->normal_waiters, list_node) {
> > > > if (!completion_done(&w->c)) {
> > > > complete(&w->c);
> > > > break;
> > > > @@ -104,6 +170,8 @@ void optee_cq_wait_final(struct optee_call_queue *cq,
> > > > /* Get out of the list */
> > > > list_del(&w->list_node);
> > > >
> > > > + cq->free_thread_count++;
> > > > +
> > > > /* Wake up one eventual waiting task */
> > > > optee_cq_complete_one(cq);
> > > >
> > > > @@ -119,6 +187,28 @@ void optee_cq_wait_final(struct optee_call_queue *cq,
> > > > mutex_unlock(&cq->mutex);
> > > > }
> > > >
> > > > +/* Count registered system sessions to reserved a system thread or not */
> > > > +static bool optee_cq_incr_sys_thread_count(struct optee_call_queue *cq)
> > > > +{
> > > > + if (cq->total_thread_count <= 1)
> > > > + return false;
> > > > +
> > > > + mutex_lock(&cq->mutex);
> > > > + cq->sys_thread_req_count++;
> > > > + mutex_unlock(&cq->mutex);
> > > > +
> > > > + return true;
> > > > +}
> > > > +
> > > > +static void optee_cq_decr_sys_thread_count(struct optee_call_queue *cq)
> > > > +{
> > > > + mutex_lock(&cq->mutex);
> > > > + cq->sys_thread_req_count--;
> > > > + /* If there's someone waiting, let it resume */
> > > > + optee_cq_complete_one(cq);
> > > > + mutex_unlock(&cq->mutex);
> > > > +}
> > > > +
> > > > /* Requires the filpstate mutex to be held */
> > > > static struct optee_session *find_session(struct optee_context_data *ctxdata,
> > > > u32 session_id)
> > > > @@ -361,6 +451,27 @@ int optee_open_session(struct tee_context *ctx,
> > > > return rc;
> > > > }
> > > >
> > > > +int optee_system_session(struct tee_context *ctx, u32 session)
> > > > +{
> > > > + struct optee *optee = tee_get_drvdata(ctx->teedev);
> > > > + struct optee_context_data *ctxdata = ctx->data;
> > > > + struct optee_session *sess;
> > > > + int rc = -EINVAL;
> > > > +
> > > > + mutex_lock(&ctxdata->mutex);
> > > > +
> > > > + sess = find_session(ctxdata, session);
> > > > + if (sess && (sess->use_sys_thread ||
> > > > + optee_cq_incr_sys_thread_count(&optee->call_queue))) {
> > > > + sess->use_sys_thread = true;
> > > > + rc = 0;
> > > > + }
> > > > +
> > > > + mutex_unlock(&ctxdata->mutex);
> > > > +
> > > > + return rc;
> > > > +}
> > > > +
> > > > int optee_close_session_helper(struct tee_context *ctx, u32 session,
> > > > bool system_thread)
> > > > {
> > > > @@ -380,6 +491,9 @@ int optee_close_session_helper(struct tee_context *ctx, u32 session,
> > > >
> > > > optee_free_msg_arg(ctx, entry, offs);
> > > >
> > > > + if (system_thread)
> > > > + optee_cq_decr_sys_thread_count(&optee->call_queue);
> > > > +
> > > > return 0;
> > > > }
> > > >
> > > > diff --git a/drivers/tee/optee/ffa_abi.c b/drivers/tee/optee/ffa_abi.c
> > > > index 5fde9d4100e3..0c9055691343 100644
> > > > --- a/drivers/tee/optee/ffa_abi.c
> > > > +++ b/drivers/tee/optee/ffa_abi.c
> > > > @@ -852,8 +852,7 @@ static int optee_ffa_probe(struct ffa_device *ffa_dev)
> > > > if (rc)
> > > > goto err_unreg_supp_teedev;
> > > > mutex_init(&optee->ffa.mutex);
> > > > - mutex_init(&optee->call_queue.mutex);
> > > > - INIT_LIST_HEAD(&optee->call_queue.waiters);
> > > > + optee_cq_init(&optee->call_queue, 0);
> > > > optee_supp_init(&optee->supp);
> > > > optee_shm_arg_cache_init(optee, arg_cache_flags);
> > > > ffa_dev_set_drvdata(ffa_dev, optee);
> > > > diff --git a/drivers/tee/optee/optee_private.h b/drivers/tee/optee/optee_private.h
> > > > index b68273051454..69f6397c3646 100644
> > > > --- a/drivers/tee/optee/optee_private.h
> > > > +++ b/drivers/tee/optee/optee_private.h
> > > > @@ -40,15 +40,35 @@ typedef void (optee_invoke_fn)(unsigned long, unsigned long, unsigned long,
> > > > unsigned long, unsigned long,
> > > > struct arm_smccc_res *);
> > > >
> > > > +/*
> > > > + * struct optee_call_waiter - TEE entry may need to wait for a free TEE thread
> > > > + * @list_node Reference in waiters list
> > > > + * @c Waiting completion reference
> > > > + * @sys_thread_req True if waiter belongs to a system thread
> > > > + */
> > > > struct optee_call_waiter {
> > > > struct list_head list_node;
> > > > struct completion c;
> > > > + bool sys_thread;
> > > > };
> > > >
> > > > +/*
> > > > + * struct optee_call_queue - OP-TEE call queue management
> > > > + * @mutex Serializes access to this struct
> > > > + * @sys_waiters List of system threads waiting to enter OP-TEE
> > > > + * @normal_waiters List of normal threads waiting to enter OP-TEE
> > > > + * @total_thread_count Overall number of thread context in OP-TEE or 0
> > > > + * @free_thread_count Number of threads context free in OP-TEE
> > > > + * @sys_thread_req_count Number of registered system thread sessions
> > > > + */
> > > > struct optee_call_queue {
> > > > /* Serializes access to this struct */
> > > > struct mutex mutex;
> > > > - struct list_head waiters;
> > > > + struct list_head sys_waiters;
> > > > + struct list_head normal_waiters;
> > > > + int total_thread_count;
> > > > + int free_thread_count;
> > > > + int sys_thread_req_count;
> > > > };
> > > >
> > > > struct optee_notif {
> > > > @@ -254,6 +274,7 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
> > > > int optee_open_session(struct tee_context *ctx,
> > > > struct tee_ioctl_open_session_arg *arg,
> > > > struct tee_param *param);
> > > > +int optee_system_session(struct tee_context *ctx, u32 session);
> > > > int optee_close_session_helper(struct tee_context *ctx, u32 session,
> > > > bool system_thread);
> > > > int optee_close_session(struct tee_context *ctx, u32 session);
> > > > @@ -303,6 +324,7 @@ static inline void optee_to_msg_param_value(struct optee_msg_param *mp,
> > > > mp->u.value.c = p->u.value.c;
> > > > }
> > > >
> > > > +void optee_cq_init(struct optee_call_queue *cq, int thread_count);
> > > > void optee_cq_wait_init(struct optee_call_queue *cq,
> > > > struct optee_call_waiter *w, bool sys_thread);
> > > > void optee_cq_wait_for_completion(struct optee_call_queue *cq,
> > > > diff --git a/drivers/tee/optee/smc_abi.c b/drivers/tee/optee/smc_abi.c
> > > > index 1033d7da03ea..5595028d6dae 100644
> > > > --- a/drivers/tee/optee/smc_abi.c
> > > > +++ b/drivers/tee/optee/smc_abi.c
> > > > @@ -1211,6 +1211,7 @@ static const struct tee_driver_ops optee_clnt_ops = {
> > > > .release = optee_release,
> > > > .open_session = optee_open_session,
> > > > .close_session = optee_close_session,
> > > > + .system_session = optee_system_session,
> > > > .invoke_func = optee_invoke_func,
> > > > .cancel_req = optee_cancel_req,
> > > > .shm_register = optee_shm_register,
> > > > @@ -1358,6 +1359,16 @@ static bool optee_msg_exchange_capabilities(optee_invoke_fn *invoke_fn,
> > > > return true;
> > > > }
> > > >
> > > > +static unsigned int optee_msg_get_thread_count(optee_invoke_fn *invoke_fn)
> > > > +{
> > > > + struct arm_smccc_res res;
> > > > +
> > > > + invoke_fn(OPTEE_SMC_GET_THREAD_COUNT, 0, 0, 0, 0, 0, 0, 0, &res);
> > > > + if (res.a0)
> > > > + return 0;
> > > > + return res.a1;
> > > > +}
> > > > +
> > > > static struct tee_shm_pool *
> > > > optee_config_shm_memremap(optee_invoke_fn *invoke_fn, void **memremaped_shm)
> > > > {
> > > > @@ -1610,6 +1621,7 @@ static int optee_probe(struct platform_device *pdev)
> > > > struct optee *optee = NULL;
> > > > void *memremaped_shm = NULL;
> > > > unsigned int rpc_param_count;
> > > > + unsigned int thread_count;
> > > > struct tee_device *teedev;
> > > > struct tee_context *ctx;
> > > > u32 max_notif_value;
> > > > @@ -1637,6 +1649,7 @@ static int optee_probe(struct platform_device *pdev)
> > > > return -EINVAL;
> > > > }
> > > >
> > > > + thread_count = optee_msg_get_thread_count(invoke_fn);
> > > > if (!optee_msg_exchange_capabilities(invoke_fn, &sec_caps,
> > > > &max_notif_value,
> > > > &rpc_param_count)) {
> > > > @@ -1726,8 +1739,7 @@ static int optee_probe(struct platform_device *pdev)
> > > > if (rc)
> > > > goto err_unreg_supp_teedev;
> > > >
> > > > - mutex_init(&optee->call_queue.mutex);
> > > > - INIT_LIST_HEAD(&optee->call_queue.waiters);
> > > > + optee_cq_init(&optee->call_queue, thread_count);
> > > > optee_supp_init(&optee->supp);
> > > > optee->smc.memremaped_shm = memremaped_shm;
> > > > optee->pool = pool;
> > > > --
> > > > 2.25.1
> > > >
> > >
>