Re: [PATCH] firmware: arm_scmi: Give SMC transport precedence over mailbox

From: Cristian Marussi
Date: Tue Oct 08 2024 - 08:28:10 EST


On Mon, Oct 07, 2024 at 10:07:46AM -0700, Florian Fainelli wrote:
> Hi Cristian,
>
> On October 7, 2024 4:52:33 AM PDT, Cristian Marussi
> <cristian.marussi@xxxxxxx> wrote:
> > On Sat, Oct 05, 2024 at 09:33:17PM -0700, Florian Fainelli wrote:
> > > Broadcom STB platforms have for historical reasons included both
> > > "arm,scmi-smc" and "arm,scmi" in their SCMI Device Tree node compatible
> > > string.
> >
> > Hi Florian,
> >
> > did not know this..
>
> It stems from us starting with a mailbox driver that did the SMC call, and
> later transitioning to the "smc" transport proper. Our boot loader provides
> the Device Tree blob to the kernel and we maintain backward/forward
> compatibility as much as possible.
>

OK.

> >
> > >
> > > After the commit cited in the Fixes tag and with a kernel
> > > configuration that enables both the SCMI and the Mailbox transports, we
> > > would probe the mailbox transport, but fail to complete since we would
> > > not have a mailbox driver available.
> > >
> > Not sure to have understood this...
> >
> > ...you mean you DO have the SMC/Mailbox SCMI transport drivers compiled
> > into the Kconfig AND you have BOTH the SMC AND Mailbox compatibles in
> > DT, BUT your platform does NOT physically have a mbox/shmem transport
> > and as a consequence, when MBOX probes (at first), you see an error from
> > the core like:
> >
> > "arm-scmi: unable to communicate with SCMI"
> >
> > since it gets no reply from the SCMI server (being not connected via
> > mbox) and it bails out .... am I right ?
>
> In an unmodified kernel where both the "mailbox" and "smc" transports are
> enabled, we get the "mailbox" driver to probe first since it matched the
> "arm,scmi" part of the compatible string and it is linked first into the
> kernel. Down the road though we will fail the initialization with:
>
> [ 1.135363] arm-scmi arm-scmi.1.auto: Using scmi_mailbox_transport
> [ 1.141901] arm-scmi arm-scmi.1.auto: SCMI max-rx-timeout: 30ms
> [ 1.148113] arm-scmi arm-scmi.1.auto: failed to setup channel for protocol:0x10
> [ 1.155828] arm-scmi arm-scmi.1.auto: error -EINVAL: failed to setup channels
> [ 1.163379] arm-scmi arm-scmi.1.auto: probe with driver arm-scmi failed with error -22
>
> Because the platform device is now bound, and there is no mechanism to
> return -ENODEV, we won't try another transport driver that would attempt to
> match the other compatibility strings. That makes sense because in general
> you specify the Device Tree precisely, and you also have a tailored kernel
> configuration. Right now this is only an issue when using arm's
> multi_v7_defconfig and arm64's defconfig, both of which we intend to
> keep on using for CI purposes.
>

Ah ok, so the issue derives from the fact that you have a single
compatible line with 2 compatibles that are not really "compatible"
from the SCMI core point of view...

...also I suppose that if we "somehow" triggered a
device_release_driver(), it would probably match again in the same
order at the next attempt (besides being an ugly thing)
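
As a rough illustration of the binding behaviour discussed in this thread,
here is a minimal userspace sketch. It is NOT driver-core or SCMI code:
struct transport, probe_ok and bind_device() are hypothetical names made up
for the example, and the model only assumes what is described above, i.e.
that transports are tried in link order, that the first one whose compatible
matches binds the device, and that a later setup failure leaves no fallback
to the next transport.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a transport driver (not a kernel structure). */
struct transport {
	const char *name;       /* label used only by this sketch */
	const char *compatible; /* single compatible the transport matches */
	int probe_ok;           /* 1 if channel setup would succeed here */
};

/* Return 1 if the device's compatible list contains 'compat'. */
static int dev_has_compatible(const char *const *dev_compat, size_t n,
			      const char *compat)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(dev_compat[i], compat) == 0)
			return 1;
	return 0;
}

/*
 * Walk the transports in link order. The first one that matches any of the
 * device compatibles binds; if its setup then fails, no other transport is
 * tried (modelled by returning NULL instead of continuing the loop).
 */
static const char *bind_device(const struct transport *drv, size_t ndrv,
			       const char *const *dev_compat, size_t ncompat)
{
	for (size_t i = 0; i < ndrv; i++) {
		if (!dev_has_compatible(dev_compat, ncompat, drv[i].compatible))
			continue;
		return drv[i].probe_ok ? drv[i].name : NULL;
	}
	return NULL; /* nothing matched at all */
}
```

With a device listing both "arm,scmi-smc" and "arm,scmi", a table that puts
the mailbox entry first (probe_ok = 0) yields NULL, while a table that puts
the smc entry first yields "smc", which is the effect the reordering in the
patch is after.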

>
> >
> > If this is the case, without this patch, after this error and the mbox probe
> > failing, the SMC transport, instead, DOES probe successfully at the end, right ?
>
> With my patch we probe the "smc" transport first and foremost and we
> successfully initialize it, therefore we do not even try the "mailbox"
> transport at all, which is intended.
>
> >
> > IOW, what is the impact without this patch, an error and a delay in the
> > probe sequence till it gets to the SMC transport probe (as second
> > attempt) or worse ? (trying to understand here...)
>
> There is no recovery without the patch, we are not giving up the arm_scmi
> platform device because there is no mechanism to return -ENODEV and allow
> any of the subsequent transport drivers enabled to attempt to take over the
> platform device and probe it again.
>

Ok...so it is a workaround hack indeed...but it seems NOT to have bad
side effects and there is definitely no cleaner way to make it bind
properly...besides fixing your DTs for the future...

Thanks,
Cristian