Hi Florian,
Thanks for the detailed explanation.
On Mon, Oct 07, 2024 at 10:07:46AM -0700, Florian Fainelli wrote:
Hi Cristian,
On October 7, 2024 4:52:33 AM PDT, Cristian Marussi
<cristian.marussi@xxxxxxx> wrote:
On Sat, Oct 05, 2024 at 09:33:17PM -0700, Florian Fainelli wrote:
Broadcom STB platforms have for historical reasons included both
"arm,scmi-smc" and "arm,scmi" in their SCMI Device Tree node compatible
string.
Hi Florian,
did not know this..
It stems from us starting with a mailbox driver that did the SMC call, and
later transitioning to the "smc" transport proper. Our boot loader provides
the Device Tree blob to the kernel and we maintain backward/forward
compatibility as much as possible.
IIUC, you need to support old kernel with SMC mailbox driver and new SMC
transport within the SCMI. Is that right understanding ?
Not sure to have understood this...
After the commit cited in the Fixes tag and with a kernel
configuration that enables both the SCMI and the Mailbox transports, we
would probe the mailbox transport, but fail to complete since we would
not have a mailbox driver available.
...you mean you DO have the SMC/Mailbox SCMI transport drivers compiled
into the Kconfig AND you have BOTH the SMC AND Mailbox compatibles in
DT, BUT your platform does NOT physically have a mbox/shmem transport
and as a consequence, when MBOX probes (at first), you see an error from
the core like:
"arm-scmi: unable to communicate with SCMI"
since it gets no reply from the SCMI server (being not connnected via
mbox) and it bails out .... am I right ?
In an unmodified kernel where both the "mailbox" and "smc" transports are
enabled, we get the "mailbox" driver to probe first since it matched the
"arm,scmi" part of the compatible string and it is linked first into the
kernel. Down the road though we will fail the initialization with:
[ 1.135363] arm-scmi arm-scmi.1.auto: Using scmi_mailbox_transport
[ 1.141901] arm-scmi arm-scmi.1.auto: SCMI max-rx-timeout: 30ms
[ 1.148113] arm-scmi arm-scmi.1.auto: failed to setup channel for
protocol:0x10
IIUC, the DTB has mailbox nodes that are available but fail only in the setup
stage ? Or is it marked unavailable and we are missing some checks either
in SCMI or mailbox ?
IOW, have you already explored that this -EINVAL is correct return value
here and can't be changed to -ENODEV ? I might be not following the failure
path correctly here, but I assume it is
scmi_chan_setup()
info->desc->ops->chan_setup()
mailbox_chan_setup()
mbox_request_channel()
[ 1.155828] arm-scmi arm-scmi.1.auto: error -EINVAL: failed to setup
channels
[ 1.163379] arm-scmi arm-scmi.1.auto: probe with driver arm-scmi failed
with error -22
Because the platform device is now bound, and there is no mechanism to
return -ENODEV, we won't try another transport driver that would attempt to
match the other compatibility strings. That makes sense because in general
you specify the Device Tree precisely, and you also have a tailored kernel
configuration. Right now this is only an issue using arm's
multi_v7_defconfig and arm64's defconfig both of which that we intend to
keep on using for CI purposes.
If this is the case, without this patch, after this error and the mbox probe
failing, the SMC transport, instead, DO probe successfully at the end, right ?
With my patch we probe the "smc" transport first and foremost and we
successfully initialize it, therefore we do not even try the "mailbox"
transport at all, which is intended.
IOW, what is the impact without this patch, an error and a delay in the
probe sequence till it gets to the SMC transport probe 9as second
attempt) or worse ? (trying to understand here...)
There is no recovery without the patch, we are not giving up the arm_scmi
platform device because there is no mechanism to return -ENODEV and allow
any of the subsequent transport drivers enabled to attempt to take over the
platform device and probe it again.
OK this sounds like you have already explored returning -ENODEV is not
an option ? It is fair enough, but just want to understand correctly.
I still think I am missing something.
I understand the bootloader maintaining backward compatibility, but
just want to understand better. I also wonder if the old SMC mailbox driver
returns -EINVAL instead of -ENODEV ? Again it is based on my assumption
about your backward compatibility usecase.