Re: [PATCH 3/3] mailbox: apple: Add driver for Apple mailboxes
From: Sven Peter
Date: Wed Sep 08 2021 - 11:39:01 EST
On Tue, Sep 7, 2021, at 23:06, Alyssa Rosenzweig wrote:
> > > > + * Both the main CPU and the co-processor see the same set of registers but
> > > > + * the first FIFO (A2I) is always used to transfer messages from the application
> > > > + * processor (us) to the I/O processor and the second one (I2A) for the
> > > > + * other direction.
> > >
> > > Do we actually know what the copro sees? It seems obvious, but *know*?
> >
> > Yes, I know. They see the exact same set of registers. I wouldn't have written
> > it like that otherwise.
>
> Ack.
>
> > > > +struct apple_mbox {
> > > > + void __iomem *regs;
> > > > + const struct apple_mbox_hw *hw;
> > > > + ...
> > > > +};
> > >
> > > This requires a lot of pointer chasing to send/receive messages. Will
> > > this become a perf issue in any case? It'd be good to get ballparks on
> > > how often these mboxes are used. (For DCP, it doesn't matter, only a few
> > > hundred messages per second. Other copros may have different behaviour)
> >
> > If this actually becomes a problem I'm happy to fix it but right now
> > this feels like premature optimization to me. DCP is going to be the
> > worst offender followed by SMC (at most a few per second when it's really
> > busy during boot time) and SEP (I've also never seen more than a few per
> > second here). The rest of them are mostly quiet after they have booted.
>
> Good enough for me -- it won't matter for DCP, so if it doesn't get any
> worse than DCP there's no point in optimizing this.
>
> > > > +static bool apple_mbox_hw_can_send(struct apple_mbox *apple_mbox)
> > > > +{
> > > > + u32 mbox_ctrl =
> > > > + readl_relaxed(apple_mbox->regs + apple_mbox->hw->a2i_control);
> > > > +
> > > > + return !(mbox_ctrl & apple_mbox->hw->control_full);
> > > > +}
> > >
> > > If you do the pointer chasing, I suspect you want accessor
> > > functions/macros at least to make it less intrusive...
> >
> > There's regmap but that can't easily be used here because I need 32bit
> > and 64bit writes. I also don't really see the advantage of replacing
> > this with some custom functions like e.g.
> >
> > mbox_ctrl = apple_mbox_hw_readl(apple_mbox, apple_mbox->hw->a2i_control);
> >
> > which is almost as long. And if I introduce a separate function just to read the
> > control reg this just becomes a lot of boilerplate and harder to follow.
> >
> > Or did you have anything else in mind?
>
> I was envisioning a macro:
>
> > #define apple_mbox_readl(mbox, name) \
> > readl_relaxed(mbox->regs + mbox->hw-> ## name)
> >
> > mbox_ctrl = apple_mbox_readl(apple_mbox, a2i_control);
>
> Now that I've typed it out, however, it seems a bit too magical to
> justify the minor space savings. And given you need both 32b and 64b
> there would be ~4 such macros which is also a lot of boilerplate.
>
> It's not a huge deal either way but I thought I'd raise it.
Yeah, I've thought about this as well but this entire hardware is
so simple that we don't gain much from it imho.
>
> > > > + dev_dbg(apple_mbox->dev, "> TX %016llx %08x\n", msg->msg0, msg->msg1);
> > >
> > > Isn't "TX" redundant here?
> >
> > Sure, but it also doesn't hurt in a debug message. I can spot the TX easier
> > but I'm sure there are people who prefer >.
>
> Fair enough.
>
> > > > + dev_dbg(apple_mbox->dev, "got FIFO empty IRQ\n");
> > >
> > > I realize it's a dev_dbg but this still seems unnecessarily noisy.
> >
> > This code path is almost never reached. I've only been able to trigger
> > it by forcing send_message to return EBUSY even when it could still
> > move messages to the FIFO to test this path.
> > This also can't be triggered more often than the TX debug message.
> > If this triggers more often there's a bug somewhere that kept the interrupt
> > enabled and then the whole machine will hang due an interrupt storm anyway. In
> > that case I'd prefer to have this as noisy as possible.
>
> Ah! Sure, makes sense. Is it worth adding a few words to the functions
> comments indicating this rarely occurs in good conditions?
Sure, I can add a small comment if it makes the code easier to understand.
>
> > > > +static irqreturn_t apple_mbox_recv_irq(int irq, void *data)
> > > > +{
> > > > + struct apple_mbox *apple_mbox = data;
> > > > + struct apple_mbox_msg msg;
> > > > +
> > > > + while (apple_mbox_hw_can_recv(apple_mbox)) {
> > > > + apple_mbox_hw_recv(apple_mbox, &msg);
> > > > + mbox_chan_received_data(&apple_mbox->chan, (void *)&msg);
> > > > + }
> > > ```
> > >
> > > A comment is warranted why this loop is safe and will always terminate,
> > > especially given this is an IRQ context. (Ah... can a malicious
> > > coprocessor DoS the AP by sending messages faster than the AP can
> > > process them? freezing the system since preemption might be disabled
> > > here? I suppose thee's no good way to protect against that level of
> > > goofy.)
> >
> > The only way this can't terminate is if the co-processor tries to stall
> > us with messages, but what's your threat model here? If the co-processor wants
> > to be evil it usually has many other ways to annoy us (e.g. ANS could just disable
> > NVMe, SMC could just trigger a shutdown, etc.)
> >
> > I could move this to threaded interrupt context though which would make that a bit
> > harder to pull off at the cost of a bit more latency from incoming messages.
> > Not sure that's worth it though.
>
> Probably not worth it, no.
One small advantage of the threaded interrupt would be that the mailbox clients could
detect the invalid messages and at least get a chance to shut the channel down if
the co-processor did this by accident.
Still not sure if that would actually help much but I guess it won't matter in
the end anyway. Changing this only requires to request a threaded irq in the probe
function so it's also not a big deal either way.
>
> > >
> > > > + * There's no race if a message comes in between the check in the while
> > > > + * loop above and the ack below: If a new messages arrives inbetween
> > > > + * those two the interrupt will just fire again immediately after the
> > > > + * ack since it's level sensitive.
> > >
> > > I don't quite understand the reasoning. Why have IRQ controls at all
> > > then on the M3?
> >
> > Re-read the two lines just above this one. If the interrupt is not acked here
> > it will keep firing even when there are no new messages.
> > But it's fine to ack it even when there are message available since it will
> > just trigger again immediately.
>
> Got it, thanks.
>
> > > > + /*
> > > > + * Only some variants of this mailbox HW provide interrupt control
> > > > + * at the mailbox level. We therefore need to handle enabling/disabling
> > > > + * interrupts at the main interrupt controller anyway for hardware that
> > > > + * doesn't. Just always keep the interrupts we care about enabled at
> > > > + * the mailbox level so that both hardware revisions behave almost
> > > > + * the same.
> > > > + */
> > >
> > > Cute. Does macOS do this? Are there any tradeoffs?
> >
> > Not sure if macOS does is and I also don't see a reason to care what it
> > does exactly. I've verified that this works with the M3 mailboxes.
> > I also don't see why there would there be any tradeoffs.
> > Do you have anything specific in mind?
> >
> > I suspect this HW was used in previous SoCs where all four interrupts
> > shared a single line and required this chained interrupt controller
> > at the mailbox level. But since they are all separate on the M1 it's
> > just a nuisance we have to deal with - especially since the ASC
> > variant got rid of the interrupt controls.
>
> Makes sense for a legacy block 👍
>
Best,
Sven