Re: [PATCH v3 0/9] Introduce a unified API for SCMI Server testing

From: Cristian Marussi
Date: Fri Oct 07 2022 - 11:37:56 EST


On Fri, Oct 07, 2022 at 04:23:33PM +0200, Vincent Guittot wrote:
> Hi Cristian,
>

Hi Vincent

thanks for give it a try !

> On Sat, 3 Sept 2022 at 20:31, Cristian Marussi <cristian.marussi@xxxxxxx> wrote:
> >
> > Hi all,
> >
> > This series aims to introduce a new SCMI unified userspace interface meant
> > to ease testing an SCMI Server implementation for compliance, fuzzing etc.,
> > from the perspective of the OSPM agent (non-secure world only ...)
> >
> > It is proposed as a testing/development facility, it is NOT meant to be a
> > feature to use in production, but only enabled in Kconfig for test
> > deployments.
> >
> > Currently an SCMI Compliance Suite like the one at [1] can only work by
> > injecting SCMI messages at the SCMI transport layer using the mailbox test
> > driver (CONFIG_MAILBOX_TEST) via its few debugfs entries and looking at
> > the related replies from the SCMI backend Server.
> >
>
> ...
>
> >
> > In V2 the runtime enable/disable switching capability has been removed
> > (for now) since still not deemed to be stable/reliable enough: as a
> > consequence when SCMI Raw support is compiled in, the regular SCMI stack
> > drivers are now inhibited permanently for that Kernel.
> >
> > A quick and trivial example from the shell...reading from a sensor
> > injecting a properly crafted packet in raw mode:
> >
> > # INJECT THE SENSOR_READING MESSAGE FOR SENSOR ID=1 (binary little endian)
> > root@deb-buster-arm64:~# echo -e -n \\x06\\x54\\x00\\x00\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00 > /sys/kernel/debug/scmi_raw/message
>
> I have tried your patchset with an SCMI server using an optee-os
> transport channel but I have a timed out error when trying your
> example above to read sensor1
>
> # echo -e -n \\x06\\x54\\x00\\x00\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00
> > /sys/kernel/debug/scmi_raw/message
> # [ 93.306690] arm-scmi firmware:scmi0: timed out in RAW response -
> HDR:00005406
>
> and there no response available when trying to read it with
> # cat /sys/kernel/debug/scmi_raw/message
>

is there anything cat'ting /sys/kernel/debug/scmi_raw/errors ?

>
> The sensor 1 can be successfully read in normal mode:
> # cat /sys/class/hwmon/hwmon0/temp1_input
> 25000
> #
>
> In both case, the SCMI server received the requests and replied successfully
>
> Could it be that you process the answer differently in case of raw mode ?
>

Well, absolutely, when in raw mode the reply is picked up directly into
the RX path and copied in a message queue to be read from asyncrhnously
later via debugfs.

... mmm I think I found the problem...the reply is picked up on the RX *IRQ*
path as of now...but in optee/SMC there is no interrupt (sometime there is in
SMC) and the reply is instead read back straight away (transport is marked as
sync_cmds_completed_on_ret=true in fact).... so this is the issue probably ...
I have NOT tested on anything but mailbox and virtio till now...and I
missed this possibility that this NO-irq reply was a thing even when NOT
in polling mode (which I do not support)...my bad :<

Ok, next week I'll rework the series to fix this big_BUG and some other minor
things...in the meantime if you want to try this snippet down below...

... this won't definitely be the final patch but it could let you experiment
more (only build tested though )

Thanks for testing !
Cristian

--->8-------

diff --git a/drivers/firmware/arm_scmi/raw_mode.c b/drivers/firmware/arm_scmi/raw_mode.c
index 13eeebe4b7a8..b9fcb66a1b6a 100644
--- a/drivers/firmware/arm_scmi/raw_mode.c
+++ b/drivers/firmware/arm_scmi/raw_mode.c
@@ -197,6 +197,8 @@ struct scmi_dbg_raw_data {
size_t rx_size;
};

+void scmi_raw_message_report(void *r, struct scmi_xfer *xfer, unsigned int idx);
+
static inline
struct scmi_raw_buffer *scmi_raw_buffer_get(struct scmi_raw_mode_info *raw,
unsigned int idx)
@@ -389,22 +391,34 @@ static void scmi_xfer_raw_worker(struct work_struct *work)

xfer = rw->xfer;

- /*
- * Waiters are queued by wait-deadline at the end, so some of
- * them could have been already expired when processed, BUT we
- * have to check the completion status anyway just in case a
- * virtually expired (aged) transaction was indeed completed
- * fine and we'll have to wait for the asynchronous part (if
- * any).
- */
- aging = jiffies - rw->start_jiffies;
- tmo = max_tmo > aging ? max_tmo - aging : 0;
-
- if ((tmo && !wait_for_completion_timeout(&xfer->done, tmo)) ||
- (!tmo && !try_wait_for_completion(&xfer->done))) {
- dev_err(dev, "timed out in RAW response - HDR:%08X\n",
- pack_scmi_header(&xfer->hdr));
- ret = -ETIMEDOUT;
+ if (!raw->desc->sync_cmds_completed_on_ret) {
+ /*
+ * Waiters are queued by wait-deadline at the end, so some of
+ * them could have been already expired when processed, BUT we
+ * have to check the completion status anyway just in case a
+ * virtually expired (aged) transaction was indeed completed
+ * fine and we'll have to wait for the asynchronous part (if
+ * any).
+ */
+ aging = jiffies - rw->start_jiffies;
+ tmo = max_tmo > aging ? max_tmo - aging : 0;
+
+ if ((tmo &&
+ !wait_for_completion_timeout(&xfer->done, tmo)) ||
+ (!tmo && !try_wait_for_completion(&xfer->done))) {
+ dev_err(dev,
+ "timed out in RAW response - HDR:%08X\n",
+ pack_scmi_header(&xfer->hdr));
+ ret = -ETIMEDOUT;
+ }
+ } else {
+ raw->desc->ops->fetch_response(rw->cinfo, xfer);
+ /* Trace polled replies. */
+ trace_scmi_msg_dump(xfer->hdr.protocol_id, xfer->hdr.id,
+ "RESP",
+ xfer->hdr.seq, xfer->hdr.status,
+ xfer->rx.buf, xfer->rx.len);
+ scmi_raw_message_report(raw, xfer, SCMI_RAW_REPLY_QUEUE);
}

/* Avoid unneeded async waits */


---8<-------