Re: [RFC] dt-bindings: mailbox: add doorbell support to ARM MHU

From: Sudeep Holla
Date: Wed Sep 09 2020 - 05:31:27 EST

On Tue, Sep 08, 2020 at 10:23:33PM -0500, Jassi Brar wrote:
> On Tue, Sep 8, 2020 at 4:15 AM Arnd Bergmann <arnd@xxxxxxxx> wrote:
> >
> > Picking up the old thread again after and getting pinged by multiple
> > colleagues about it (thanks!) reading through the history.
> >
> > On Fri, Jun 12, 2020 at 7:29 AM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
> > >
> > > On 11-06-20, 19:34, Jassi Brar wrote:
> > > > In the first post in this thread, Viresh lamented that mailbox
> > > > introduces "a few ms" delay in the scheduler path.
> > > > Your own tests show that is certainly not the case -- average is the
> > > > same as proposed virtual channels 50-100us, the best case is 3us vs
> > > > 53us for virtual channels.
> > >
> > > Hmmm, I am not sure where is the confusion here Jassi. There are two
> > > things which are very very different from each other.
> > >
> > > - Time taken by the mailbox framework (and remote for acknowledging
> > > it) for completion of a single request, this can be 3us to 100s of
> > > us. This is clear for everyone. THIS IS NOT THE PROBLEM.
> > >
> > > - Delay introduced by few of such requests on the last one, i.e. 5
> > > normal requests followed by an important one (like DVFS), the last
> > > one needs to wait for the first 5 to finish first. THIS IS THE
> > > PROBLEM.
> >
> > Earlier, Jassi also commented "Linux does not provide real-time
> > guarantees", which to me is what actually causes the issue here:
> >
> > Linux having timeouts when communicating to the firmware means
> > that it relies on the hardware and firmware having real-time behavior
> > even when not providing real-time guarantees to its processes.
> >
> The timeout used in SCMI is simply based on how long the Juno (?)
> platform takes to reply in most cases.

Just FYI, the timeouts in SCMI can be platform specific. So each platform
have flexibility to choose it's own choice of timeout and need not be
stuck with so called *Juno values".

The architects of SCMI believe the transfers(especially DVFS) must not exceed
few 100s of uS and worst case for any transfers must be few ms. My initial
choice of 30ms was based on the jiffes based timer and 100Hz. Architect claim
that is too much, but I thought 3 jiffies at minimum in case we start timer
when we are close to the boundaries.

> Talking proper code-design, the timeout (if at all) shouldn't even be
> a hardcoded value, but instead taken from the platform.
> > When comparing the two usage models, it's clear that the minimum
> > latency for a message delivery is always at least the time time
> > to process an interrupt, plus at least one expensive MMIO read
> > and one less expensive posted MMIO write for an Ack. If we
> > have a doorbell plus out-of-band message, we need an extra
> > DMA barrier and a read from coherent memory, both of which can
> > be noticeable. As soon as messages are queued in the current
> > model, the maximum latency increases by a potentially unbounded
> > number of round-trips, while in the doorbell model that problem
> > does not exist, so I agree that we need to handle both modes
> > in the kernel deal with all existing hardware as well as firmware
> > that requires low-latency communication.
> >
> From the test case Sudeep last shared, the scmi usage on mhu doesn't
> not even hit any bottleneck ... the test "failed" because of the too
> small hardcoded timeout value. Otherwise the current code actually
> shows better numbers.

No disagreement on the latter part. But we can't ignore the bottlenecks
even if they are rare.

> We need some synthetic tests to bring the limitation to the surface. I
> agree that there may be such a test case, however fictitious. For that
> reason, I am ok with the doorbell mode.

Thanks !