Re: [PATCH] firmware: arm_scmi: Give SMC transport precedence over mailbox
From: Sudeep Holla
Date: Tue Oct 08 2024 - 09:06:30 EST
Hi Florian,
Thanks for the detailed explanation.
On Mon, Oct 07, 2024 at 10:07:46AM -0700, Florian Fainelli wrote:
> Hi Cristian,
>
> On October 7, 2024 4:52:33 AM PDT, Cristian Marussi
> <cristian.marussi@xxxxxxx> wrote:
> > On Sat, Oct 05, 2024 at 09:33:17PM -0700, Florian Fainelli wrote:
> > > Broadcom STB platforms have for historical reasons included both
> > > "arm,scmi-smc" and "arm,scmi" in their SCMI Device Tree node compatible
> > > string.
> >
> > Hi Florian,
> >
> > did not know this..
>
> It stems from us starting with a mailbox driver that did the SMC call, and
> later transitioning to the "smc" transport proper. Our boot loader provides
> the Device Tree blob to the kernel and we maintain backward/forward
> compatibility as much as possible.
>
IIUC, you need to support both old kernels with the SMC mailbox driver and the
new SMC transport within SCMI. Is that the right understanding?
> >
> > >
> > > After the commit cited in the Fixes tag and with a kernel
> > > configuration that enables both the SCMI and the Mailbox transports, we
> > > would probe the mailbox transport, but fail to complete since we would
> > > not have a mailbox driver available.
> > >
> > Not sure to have understood this...
> >
> > ...you mean you DO have the SMC/Mailbox SCMI transport drivers compiled
> > into the Kconfig AND you have BOTH the SMC AND Mailbox compatibles in
> > DT, BUT your platform does NOT physically have a mbox/shmem transport
> > and as a consequence, when MBOX probes (at first), you see an error from
> > the core like:
> >
> > "arm-scmi: unable to communicate with SCMI"
> >
> > since it gets no reply from the SCMI server (being not connected via
> > mbox) and it bails out .... am I right ?
>
> In an unmodified kernel where both the "mailbox" and "smc" transports are
> enabled, we get the "mailbox" driver to probe first since it matched the
> "arm,scmi" part of the compatible string and it is linked first into the
> kernel. Down the road though we will fail the initialization with:
>
> [ 1.135363] arm-scmi arm-scmi.1.auto: Using scmi_mailbox_transport
> [ 1.141901] arm-scmi arm-scmi.1.auto: SCMI max-rx-timeout: 30ms
> [ 1.148113] arm-scmi arm-scmi.1.auto: failed to setup channel for
> protocol:0x10
IIUC, the DTB has mailbox nodes that are available but fail only at the setup
stage? Or are they marked unavailable and we are missing some checks in either
SCMI or mailbox?
IOW, have you already confirmed that -EINVAL is the correct return value
here and can't be changed to -ENODEV? I might not be following the failure
path correctly here, but I assume it is:
	scmi_chan_setup()
		info->desc->ops->chan_setup()
			mailbox_chan_setup()
				mbox_request_channel()
> [ 1.155828] arm-scmi arm-scmi.1.auto: error -EINVAL: failed to setup
> channels
> [ 1.163379] arm-scmi arm-scmi.1.auto: probe with driver arm-scmi failed
> with error -22
>
> Because the platform device is now bound, and there is no mechanism to
> return -ENODEV, we won't try another transport driver that would attempt to
> match the other compatibility strings. That makes sense because in general
> you specify the Device Tree precisely, and you also have a tailored kernel
> configuration. Right now this is only an issue using arm's
> multi_v7_defconfig and arm64's defconfig, both of which we intend to
> keep on using for CI purposes.
>
>
> >
> > If this is the case, without this patch, after this error and the mbox probe
> > failing, the SMC transport, instead, DOES probe successfully at the end, right ?
>
> With my patch we probe the "smc" transport first and foremost and we
> successfully initialize it, therefore we do not even try the "mailbox"
> transport at all, which is intended.
>
> >
> > IOW, what is the impact without this patch, an error and a delay in the
> > probe sequence till it gets to the SMC transport probe (as second
> > attempt) or worse ? (trying to understand here...)
>
> There is no recovery without the patch, we are not giving up the arm_scmi
> platform device because there is no mechanism to return -ENODEV and allow
> any of the subsequent transport drivers enabled to attempt to take over the
> platform device and probe it again.
>
OK, this sounds like you have already explored returning -ENODEV and found it
is not an option? That is fair enough, but I just want to understand correctly.
I still think I am missing something.
I understand the bootloader maintaining backward compatibility, but I
just want to understand better. I also wonder whether the old SMC mailbox
driver returns -EINVAL instead of -ENODEV? Again, this is based on my
assumption about your backward compatibility use case.
--
Regards,
Sudeep