Re: [PATCH] irqchip/sifive-plic: add irq_mask and irq_unmask

From: Palmer Dabbelt
Date: Mon Sep 16 2019 - 18:18:04 EST


On Mon, 16 Sep 2019 14:41:07 PDT (-0700), Darius Rad wrote:
On 9/16/19 4:51 PM, Palmer Dabbelt wrote:
On Mon, 16 Sep 2019 12:04:56 PDT (-0700), Darius Rad wrote:
On 9/15/19 2:20 PM, Marc Zyngier wrote:
On Sun, 15 Sep 2019 18:31:33 +0100,
Palmer Dabbelt <palmer@xxxxxxxxxx> wrote:

Hi Palmer,


On Sun, 15 Sep 2019 07:24:20 PDT (-0700), maz@xxxxxxxxxx wrote:
On Thu, 12 Sep 2019 22:40:34 +0100,
Darius Rad <darius@xxxxxxxxxxxx> wrote:

Hi Darius,


As per the existing comment, irq_mask and irq_unmask do not need
to do anything for the PLIC. However, the functions must exist
(the pointers cannot be NULL) as they are not optional, based on
the documentation (Documentation/core-api/genericirq.rst) as well
as existing usage (e.g., include/linux/irqchip/chained_irq.h).

Signed-off-by: Darius Rad <darius@xxxxxxxxxxxx>
---
Âdrivers/irqchip/irq-sifive-plic.c | 13 +++++++++----
Â1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index cf755964f2f8..52d5169f924f 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -111,6 +111,13 @@ static void plic_irq_disable(struct irq_data *d)
ÂÂÂÂ plic_irq_toggle(cpu_possible_mask, d->hwirq, 0);
Â}
Â+/*
+ * There is no need to mask/unmask PLIC interrupts. They are "masked"
+ * by reading claim and "unmasked" when writing it back.
+ */
+static void plic_irq_mask(struct irq_data *d) { }
+static void plic_irq_unmask(struct irq_data *d) { }

This outlines a bigger issue. If your irqchip doesn't require
mask/unmask, you're probably not using the right interrupt
flow. Looking at the code, I see you're using handle_simple_irq, which
is almost universally wrong.

As per the description above, these interrupts should be using the
fasteoi flow, which is designed for this exact behaviour (the
interrupt controller knows which interrupt is in flight and doesn't
require SW to do anything bar signalling the EOI).

Another thing is that mask/unmask tends to be a requirement, while
enable/disable tends to be optional. There is no hard line here, but
the expectations are that:

(a) A disabled line can drop interrupts
(b) A masked line cannot drop interrupts

Depending what the PLIC architecture mandates, you'll need to
implement one and/or the other. Having just (a) is indicative of a HW
bug, and I'm not assuming that this is the case. (b) only is pretty
common, and (a)+(b) has a few adepts. My bet is that it requires (b)
only.

+
Â#ifdef CONFIG_SMP
Âstatic int plic_set_affinity(struct irq_data *d,
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ const struct cpumask *mask_val, bool force)
@@ -138,12 +145,10 @@ static int plic_set_affinity(struct irq_data *d,
 static struct irq_chip plic_chip = {
ÂÂÂÂ .nameÂÂÂÂÂÂÂ = "SiFive PLIC",
-ÂÂÂ /*
- * There is no need to mask/unmask PLIC interrupts. They are "masked"
-ÂÂÂÂ * by reading claim and "unmasked" when writing it back.
-ÂÂÂÂ */
ÂÂÂÂ .irq_enableÂÂÂ = plic_irq_enable,
ÂÂÂÂ .irq_disableÂÂÂ = plic_irq_disable,
+ÂÂÂ .irq_maskÂÂÂ = plic_irq_mask,
+ÂÂÂ .irq_unmaskÂÂÂ = plic_irq_unmask,
Â#ifdef CONFIG_SMP
ÂÂÂÂ .irq_set_affinity = plic_set_affinity,
Â#endif

Can you give the following patch a go? It brings the irq flow in line
with what the HW can do. It is of course fully untested (not even
compile tested...).

Thanks,

ÂÂÂÂM.

From c0ce33a992ec18f5d3bac7f70de62b1ba2b42090 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@xxxxxxxxxx>
Date: Sun, 15 Sep 2019 15:17:45 +0100
Subject: [PATCH] irqchip/sifive-plic: Switch to fasteoi flow

The SiFive PLIC interrupt controller seems to have all the HW
features to support the fasteoi flow, but the driver seems to be
stuck in a distant past. Bring it into the 21st century.

Thanks. We'd gotten these comments during the review process but
nobody had gotten the time to actually fix the issues.

No worries. The IRQ subsystem is an acquired taste... ;-)


Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
---
Âdrivers/irqchip/irq-sifive-plic.c | 29 +++++++++++++++--------------
Â1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index cf755964f2f8..8fea384d392b 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -97,7 +97,7 @@ static inline void plic_irq_toggle(const struct cpumask *mask,
ÂÂÂÂ }
Â}
Â-static void plic_irq_enable(struct irq_data *d)
+static void plic_irq_mask(struct irq_data *d)

Of course, this is wrong. The perks of trying to do something at the
last minute while boarding an airplane. Don't do that.

This should of course read "plic_irq_unmask"...

Â{
ÂÂÂÂ unsigned int cpu = cpumask_any_and(irq_data_get_affinity_mask(d),
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ cpu_online_mask);
@@ -106,7 +106,7 @@ static void plic_irq_enable(struct irq_data *d)
ÂÂÂÂ plic_irq_toggle(cpumask_of(cpu), d->hwirq, 1);
Â}
Â-static void plic_irq_disable(struct irq_data *d)
+static void plic_irq_unmask(struct irq_data *d)

... and this should be "plic_irq_mask".

[...]

Reviewed-by: Palmer Dabbelt <palmer@xxxxxxxxxx>
Tested-by: Palmer Dabbelt <palmer@xxxxxxxxxx> (QEMU Boot)

Huhuh... It may be that QEMU doesn't implement the full-fat PLIC, as
the above bug should have kept the IRQ lines masked.

We should test them on the hardware, but I don't have any with me
right now. David's probably in the best spot to do this, as he's got
a setup that does all the weird interrupt sources (ie, PCIe).

David: do you mind testing this? I've put the patch here:

ÂÂ ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git
ÂÂ -b plic-fasteoi

I've pushed out a branch with the fixed patch:

git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git irq/plic-fasteoi


That patch works for me on real-ish hardware. I tried on two FPGA
systems that have different PLIC implementations. Both include
a PCIe root port (and associated interrupt source). So for
whatever it's worth:

Tested-by: Darius Rad <darius@xxxxxxxxxxxx>

Awesome, thanks. Would it be OK to put a "(on two hardware PLIC implementations)" after that, just so we're clear as to who tested what?

Fine by me.


Assuming one of yours wasn't a SiFive PLIC then it'd be great if David could still give this a whack, but I don't think it strictly needs to block merging the patch. I've included the right David this time, with any luck that will be more fruitful :)

One of the systems I tested was based on rocket-chip, and the
associated PLIC, which I guess is the SiFive PLIC, right? Can't hurt
to have more testing, though.

Ya, that's ours.




Thanks,

ÂÂÂÂM.