Re: [PATCH v2 4/5] firmware: qcom: scm: Add wait-queue helper functions

From: Rajendra Nayak
Date: Thu Aug 11 2022 - 01:44:28 EST



On 8/11/2022 8:30 AM, Guru Das Srinagesh wrote:
Hey Rajendra,

Sorry for the delay in response. Needed to clarify with internal team members

no worries,

on these questions before responding.

On Aug 02 2022 17:07, Rajendra Nayak wrote:

On 7/23/2022 4:07 AM, Guru Das Srinagesh wrote:
When the firmware (FW) supports multiple requests per VM, and the VM
also supports it via the `allow-multi-call` device tree flag, the
floodgates are thrown open for them to all reach the firmware at the
same time.

[...]

2) SCM_WAITQ_WAKE:

When an SCM call receives this return value instead of success
or error, FW wishes to signal HLOS to wake up a (different)
previously sleeping call.

FW tells HLOS which call to wake up via the additional return
values `wq_ctx`, `smc_call_ctx` and `flags`. The first two have
already been explained above.

`flags` can be either WAKE_ONE or WAKE_ALL. Meaning, wake either
one, or all, of the SCM calls that HLOS is associating with the
given `wq_ctx`.

A sleeping SCM call can be woken up by either an interrupt that FW
raises, or via a SCM_WAITQ_WAKE return value for a new SCM call.

Do you know why the FW was not designed to always use an interrupt?
That would have made the handling of this in kernel a lot less complicated.

Because:

1. Our firmware in TrustZone cannot raise interrupts on its own - it needs the
hypervisor to do that.

2. Thus, in platforms where there is no hypervisor, there is no interrupt
possible - only SMC_WAITQ_WAKE.

Therefore, relying only on an interrupt would render the driver unable to
support platforms without a hypervisor, which we didn't want to do.

Thanks Guru for the clarification, however what problem are we really solving
with this on platforms _without_ a hypervisor?

Your cover letter said
'The problem this feature is fixing is as follows. In a scenario where there is
a VM in addition to HLOS (and an underlying hypervisor):'

So I assumed this was primarily for platforms _with_ a VM/Hypervisor?

I understand that even with just the HLOS and no VM, if we can get these requests
processed concurrently it still adds value, but eventually Trustzone will
still process these requests sequentially right?

The handshake mechanism that HLOS uses to talk to FW about wait-queue
operations involves three new SMC calls. These are:


[...]

+static void scm_irq_work(struct work_struct *work)
+{
+ int ret;
+ u32 wq_ctx, flags, more_pending = 0;
+ struct completion *wq_to_wake;
+ struct qcom_scm_waitq *w = container_of(work, struct qcom_scm_waitq, scm_irq_work);
+ struct qcom_scm *scm = container_of(w, struct qcom_scm, waitq);
+
+ do {
+ ret = scm_get_wq_ctx(&wq_ctx, &flags, &more_pending);
+ if (ret) {
+ pr_err("GET_WQ_CTX SMC call failed: %d\n", ret);
+ return;
+ }
+
+ wq_to_wake = qcom_scm_lookup_wq(scm, wq_ctx);
+ if (IS_ERR_OR_NULL(wq_to_wake)) {
+ pr_err("No waitqueue found for wq_ctx %d: %ld\n",
+ wq_ctx, PTR_ERR(wq_to_wake));
+ return;

What happens if at this point 'more_pending' was true? will the FW raise
another interrupt?

Hmm. At this point, the interrupt handler is early-exiting without waking up a
sleeping call via the flag_handler() because firmware has goofed up and given
it an invalid wq_ctx. We have bigger problems than `more_pending` being true.


+ }
+
+ scm_waitq_flag_handler(wq_to_wake, flags);
+ } while (more_pending);
+}

Thank you.

Guru Das.