Re: [v5,2/3] PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192
From: Nicolas Boichat
Date: Mon Dec 21 2020 - 22:56:41 EST
On Tue, Dec 22, 2020 at 11:38 AM Jianjun Wang <jianjun.wang@xxxxxxxxxxxx> wrote:
>
> On Mon, 2020-12-21 at 10:18 +0800, Nicolas Boichat wrote:
> > On Wed, Dec 2, 2020 at 9:39 PM Jianjun Wang <jianjun.wang@xxxxxxxxxxxx> wrote:
> > > [snip]
> > > +static irq_hw_number_t mtk_pcie_msi_get_hwirq(struct msi_domain_info *info,
> > > + msi_alloc_info_t *arg)
> > > +{
> > > + struct msi_desc *entry = arg->desc;
> > > + struct mtk_pcie_port *port = info->chip_data;
> > > + int hwirq;
> > > +
> > > + mutex_lock(&port->lock);
> > > +
> > > + hwirq = bitmap_find_free_region(port->msi_irq_in_use, PCIE_MSI_IRQS_NUM,
> > > + order_base_2(entry->nvec_used));
> > > + if (hwirq < 0) {
> > > + mutex_unlock(&port->lock);
> > > + return -ENOSPC;
> > > + }
> > > +
> > > + mutex_unlock(&port->lock);
> > > +
> > > + return hwirq;
> >
> > Code is good, but I had to look twice to make sure the mutex is
> > unlocked. Is the following marginally better?
> >
> > hwirq = ...;
> >
> > mutex_unlock(&port->lock);
> >
> > if (hwirq < 0)
> > return -ENOSPC;
> >
> > return hwirq;
>
> Impressive, I will fix it in the next version, and I think the hwirq can
> be returned directly since it will be a negative value if
> bitmap_find_free_region fails. The code will look like the following:
>
> hwirq = ...;
>
> mutex_unlock(&port->lock);
>
> return hwirq;
SG, as long as you're okay with returning -ENOMEM instead of -ENOSPC.
But now I doubt whether negative return values are OK at all, since
irq_hw_number_t is unsigned long.
msi_domain_alloc
(https://elixir.bootlin.com/linux/latest/source/kernel/irq/msi.c#L143)
passes the value to irq_find_mapping
(https://elixir.bootlin.com/linux/latest/source/kernel/irq/irqdomain.c#L882)
without checking it, and I'm not convinced irq_find_mapping will error out
gracefully...
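
To illustrate the concern, here is a minimal userspace sketch (not kernel
code). The typedef mirrors the kernel's irq_hw_number_t; the names
fake_find_free_region(), get_hwirq_sketch(), looks_like_error() and
self_check() are made up for this example, and the stub simply assumes the
allocator failed with -ENOMEM.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Stand-in for the kernel typedef (unsigned long in the kernel). */
typedef unsigned long irq_hw_number_t;

/* Hypothetical stub: pretend the bitmap allocator failed with -ENOMEM. */
static int fake_find_free_region(void)
{
	return -ENOMEM;
}

/* Same shape as the proposed helper: an int returned as irq_hw_number_t. */
static irq_hw_number_t get_hwirq_sketch(void)
{
	int hwirq = fake_find_free_region();

	return hwirq;	/* implicit int -> unsigned long conversion */
}

/* A caller can only spot the error by casting back to a signed type. */
static int looks_like_error(irq_hw_number_t hwirq)
{
	return (long)hwirq < 0;
}

/* Usage example: the error code turns into a very large hwirq number. */
void self_check(void)
{
	irq_hw_number_t v = get_hwirq_sketch();

	assert(v == ULONG_MAX - ENOMEM + 1UL);
	assert(looks_like_error(v));
}
```

So unless the caller casts the value back, the failure looks like a
(huge) valid hwirq number.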
> >
> > > +}
> > > +
> > > [snip]
> > > +static void mtk_pcie_msi_handler(struct irq_desc *desc)
> > > +{
> > > + struct mtk_pcie_msi *msi_info = irq_desc_get_handler_data(desc);
> > > + struct irq_chip *irqchip = irq_desc_get_chip(desc);
> > > + unsigned long msi_enable, msi_status;
> > > + unsigned int virq;
> > > + irq_hw_number_t bit, hwirq;
> > > +
> > > + chained_irq_enter(irqchip, desc);
> > > +
> > > + msi_enable = readl(msi_info->base + PCIE_MSI_ENABLE_OFFSET);
> > > + while ((msi_status = readl(msi_info->base + PCIE_MSI_STATUS_OFFSET))) {
> > > + msi_status &= msi_enable;
> >
> > I don't know much about MSI, but what happens if you have a bit that
> > is set in PCIE_MSI_STATUS_OFFSET register, but not in msi_enable?
>
> If a bit is set in the PCIE_MSI_STATUS_OFFSET register but not in
> msi_enable, it must be abnormal usage of MSI or something has gone wrong;
> it should be ignored, since we cannot find the corresponding handler.
>
> > Sounds like you'll just spin-loop forever without acknowledging the
> > interrupt.
>
> The interrupt will be acknowledged in the irq_ack callback of
> mtk_msi_irq_chip, which belongs to the msi_domain.
Let's try to go through it (and please explain to me if I get this wrong).

Say we have:

msi_enable = [PCIE_MSI_ENABLE_OFFSET] = 0x1;

while loop:
    msi_status = [PCIE_MSI_STATUS_OFFSET] = 0x3;
    msi_status &= msi_enable => msi_status = 0x3 & 0x1 = 0x1;
    for_each_set_bit(msi_status) {
        do something that presumably will disable the MSI interrupt status?
    }
    (next loop iteration)
    msi_status = [PCIE_MSI_STATUS_OFFSET] = 0x2;
    msi_status &= msi_enable => msi_status = 0x2 & 0x1 = 0x0;
    for_each_set_bit(msi_status) => does nothing.
    msi_status = [PCIE_MSI_STATUS_OFFSET] = 0x2;
    (infinite loop)

Basically, I'm wondering if you should replace the while condition
statement with:

    while ((msi_status = readl(msi_info->base + PCIE_MSI_STATUS_OFFSET) &
                         msi_enable))
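
To make the trace above concrete, here is a small userspace simulation of
both loop shapes. It is only a sketch under my assumptions: the registers
are plain variables, handling a bit is assumed to clear it in the status
register, and run_unmasked(), run_masked() and self_check() are names
invented for this example. The iteration cap exists only so the first
variant can report that it would never exit.

```c
#include <assert.h>
#include <stdint.h>

/* Original loop shape: test the raw status, mask afterwards.
 * Returns the number of iterations, or -1 if max_iters was exceeded. */
static int run_unmasked(uint32_t status, uint32_t enable, int max_iters)
{
	uint32_t msi_status;
	int iters = 0;

	while ((msi_status = status)) {
		if (iters++ >= max_iters)
			return -1;	/* would spin forever */
		msi_status &= enable;
		status &= ~msi_status;	/* assumed: handling clears the bit */
	}
	return iters;
}

/* Suggested loop shape: mask before testing the loop condition. */
static int run_masked(uint32_t status, uint32_t enable, int max_iters)
{
	uint32_t msi_status;
	int iters = 0;

	while ((msi_status = status & enable)) {
		if (iters++ >= max_iters)
			return -1;
		status &= ~msi_status;	/* assumed: handling clears the bit */
	}
	return iters;
}

/* Usage example with the values from the trace: status 0x3, enable 0x1. */
void self_check(void)
{
	assert(run_unmasked(0x3, 0x1, 100) == -1);	/* never exits */
	assert(run_masked(0x3, 0x1, 100) == 1);		/* one pass, then done */
}
```

With the masked condition, the stray bit 0x2 is left pending in the status
register but no longer keeps the handler in the loop.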