Re: [PATCH 3/3] mailbox: apple: Add driver for Apple mailboxes

From: Alyssa Rosenzweig
Date: Tue Sep 07 2021 - 17:07:03 EST


> > > + * Both the main CPU and the co-processor see the same set of registers but
> > > + * the first FIFO (A2I) is always used to transfer messages from the application
> > > + * processor (us) to the I/O processor and the second one (I2A) for the
> > > + * other direction.
> >
> > Do we actually know what the copro sees? It seems obvious, but *know*?
>
> Yes, I know. They see the exact same set of registers. I wouldn't have written
> it like that otherwise.

Ack.

> > > +struct apple_mbox {
> > > + void __iomem *regs;
> > > + const struct apple_mbox_hw *hw;
> > > + ...
> > > +};
> >
> > This requires a lot of pointer chasing to send/receive messages. Will
> > this become a perf issue in any case? It'd be good to get ballparks on
> > how often these mboxes are used. (For DCP, it doesn't matter, only a few
> > hundred messages per second. Other copros may have different behaviour)
>
> If this actually becomes a problem I'm happy to fix it but right now
> this feels like premature optimization to me. DCP is going to be the
> worst offender followed by SMC (at most a few per second when it's really
> busy during boot time) and SEP (I've also never seen more than a few per
> second here). The rest of them are mostly quiet after they have booted.

Good enough for me -- it won't matter for DCP, so if it doesn't get any
worse than DCP there's no point in optimizing this.

> > > +static bool apple_mbox_hw_can_send(struct apple_mbox *apple_mbox)
> > > +{
> > > + u32 mbox_ctrl =
> > > + readl_relaxed(apple_mbox->regs + apple_mbox->hw->a2i_control);
> > > +
> > > + return !(mbox_ctrl & apple_mbox->hw->control_full);
> > > +}
> >
> > If you do the pointer chasing, I suspect you want accessor
> > functions/macros at least to make it less intrusive...
>
> There's regmap but that can't easily be used here because I need 32bit
> and 64bit writes. I also don't really see the advantage of replacing
> this with some custom functions like e.g.
>
> mbox_ctrl = apple_mbox_hw_readl(apple_mbox, apple_mbox->hw->a2i_control);
>
> which is almost as long. And if I introduce a separate function just to read the
> control reg this just becomes a lot of boilerplate and harder to follow.
>
> Or did you have anything else in mind?

I was envisioning a macro:

> #define apple_mbox_readl(mbox, name) \
> readl_relaxed(mbox->regs + mbox->hw-> ## name)
>
> mbox_ctrl = apple_mbox_readl(apple_mbox, a2i_control);

Now that I've typed it out, however, it seems a bit too magical to
justify the minor space savings. And given you need both 32b and 64b
there would be ~4 such macros which is also a lot of boilerplate.

It's not a huge deal either way but I thought I'd raise it.

> > > + dev_dbg(apple_mbox->dev, "> TX %016llx %08x\n", msg->msg0, msg->msg1);
> >
> > Isn't "TX" redundant here?
>
> Sure, but it also doesn't hurt in a debug message. I can spot the TX easier
> but I'm sure there are people who prefer >.

Fair enough.

> > > + dev_dbg(apple_mbox->dev, "got FIFO empty IRQ\n");
> >
> > I realize it's a dev_dbg but this still seems unnecessarily noisy.
>
> This code path is almost never reached. I've only been able to trigger
> it by forcing send_message to return EBUSY even when it could still
> move messages to the FIFO to test this path.
> This also can't be triggered more often than the TX debug message.
> If this triggers more often there's a bug somewhere that kept the interrupt
> enabled and then the whole machine will hang due an interrupt storm anyway. In
> that case I'd prefer to have this as noisy as possible.

Ah! Sure, makes sense. Is it worth adding a few words to the functions
comments indicating this rarely occurs in good conditions?

> > > +static irqreturn_t apple_mbox_recv_irq(int irq, void *data)
> > > +{
> > > + struct apple_mbox *apple_mbox = data;
> > > + struct apple_mbox_msg msg;
> > > +
> > > + while (apple_mbox_hw_can_recv(apple_mbox)) {
> > > + apple_mbox_hw_recv(apple_mbox, &msg);
> > > + mbox_chan_received_data(&apple_mbox->chan, (void *)&msg);
> > > + }
> > ```
> >
> > A comment is warranted why this loop is safe and will always terminate,
> > especially given this is an IRQ context. (Ah... can a malicious
> > coprocessor DoS the AP by sending messages faster than the AP can
> > process them? freezing the system since preemption might be disabled
> > here? I suppose thee's no good way to protect against that level of
> > goofy.)
>
> The only way this can't terminate is if the co-processor tries to stall
> us with messages, but what's your threat model here? If the co-processor wants
> to be evil it usually has many other ways to annoy us (e.g. ANS could just disable
> NVMe, SMC could just trigger a shutdown, etc.)
>
> I could move this to threaded interrupt context though which would make that a bit
> harder to pull off at the cost of a bit more latency from incoming messages.
> Not sure that's worth it though.

Probably not worth it, no.

> >
> > > + * There's no race if a message comes in between the check in the while
> > > + * loop above and the ack below: If a new messages arrives inbetween
> > > + * those two the interrupt will just fire again immediately after the
> > > + * ack since it's level sensitive.
> >
> > I don't quite understand the reasoning. Why have IRQ controls at all
> > then on the M3?
>
> Re-read the two lines just above this one. If the interrupt is not acked here
> it will keep firing even when there are no new messages.
> But it's fine to ack it even when there are message available since it will
> just trigger again immediately.

Got it, thanks.

> > > + /*
> > > + * Only some variants of this mailbox HW provide interrupt control
> > > + * at the mailbox level. We therefore need to handle enabling/disabling
> > > + * interrupts at the main interrupt controller anyway for hardware that
> > > + * doesn't. Just always keep the interrupts we care about enabled at
> > > + * the mailbox level so that both hardware revisions behave almost
> > > + * the same.
> > > + */
> >
> > Cute. Does macOS do this? Are there any tradeoffs?
>
> Not sure if macOS does is and I also don't see a reason to care what it
> does exactly. I've verified that this works with the M3 mailboxes.
> I also don't see why there would there be any tradeoffs.
> Do you have anything specific in mind?
>
> I suspect this HW was used in previous SoCs where all four interrupts
> shared a single line and required this chained interrupt controller
> at the mailbox level. But since they are all separate on the M1 it's
> just a nuisance we have to deal with - especially since the ASC
> variant got rid of the interrupt controls.

Makes sense for a legacy block 👍