Re: [PATCH] irqchip/gic-v3: Fix priority comparison when non-secure priorities are used
From: Alexandru Elisei
Date: Thu Aug 12 2021 - 10:22:59 EST
Hi Marc,
On 8/12/21 2:09 PM, Marc Zyngier wrote:
> On Thu, 12 Aug 2021 12:51:34 +0100,
> Alexandru Elisei <alexandru.elisei@xxxxxxx> wrote:
>> Hi,
>>
>> After re-familiarizing myself with the spec, it starting to look to
>> me like indeed there's something not quite right (read as: totally
>> broken) with my patch.
>>
>> Arm IHI 0069F, the pseudocode for reading ICC_RPR_EL1 (page 11-797),
>> says that the priority returned is unchanged if SCTLR_EL3.FIQ ==
>> 0.
> Sure, but look at what ICC_RPR_EL1 does for FIQ==1:
>
> <quote>
> if HaveEL(EL3) && !IsSecure() && SCR_EL3.FIQ == '1' then
> // A Non-secure GIC access and Group 0 inaccessible to Non-secure.
> if pPriority<7> == '0' then
> // Priority is in Secure half and not visible to Non-secure
> Priority = Zeros();
> elsif !IsOnes(pPriority) then
> // Non-secure access and not idle, so physical priority must be shifted
> pPriority<7:0> = (pPriority AND PRIMask())<6:0>:'0';
>
> return ZeroExtend(pPriority);
> </quote>
>
> See how the the priority is shifted *left* (bits [6:0] end up in
> [7:1])?
Yes, when SCR_EL3.FIQ=1, but gic_nonsecure_priorities is enabled when
SCR_EL3.FIQ=0 (gic_has_group0()). In that case ICC_RPR_EL1 returns (what I assume
to be) the highest priority interrupt from ICC_AP0R_EL1, ICC_AP1R_EL1NS and
ICC_AP1R_EL1S. Isn't that the secure view (or Distributor value) of the priority?
>
>> This means that the ICC_RPR_EL1 read will return the secure view
>> (the value as it is stored by the GIC) of the priority, so for
>> pseudo-nmis it will return (GICD_INT_NMI_PRI >> 1) | 0x80, which
>> definitely != GICD_INT_NMI_PRI.
> That's not my reading of the pseudocode.
>
>> This is further confirmed by this statement on page 4-67:
>>
>> "When GICD_CTLR.DS == 0, [..] For Non-secure access to ICC_PMR_EL1
>> and ICC_RPR_EL1 when SCR_EL3.FIQ == 0: The Secure, unshifted view
>> applies."
>>
>> I don't know how I missed that during testing.
>>
>> Did a quick test on the model with PMU NMIs (GICD_CTRL.DS = 0,
>> SCTLR_EL2.FIQ = 0), gic_handle_nmi() was not being called at all,
> 0? Really????
Yes, I was also very surprised myself, I was certain that I tested my patch on the
model with PMU NMIs. Here's what I did to get this result, I would very much
appreciate anyone pointing out what I did wrong here.
$ git diff
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index e0f4debe64e1..749748df2466 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -623,6 +623,8 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs)
bool irqs_enabled = interrupts_enabled(regs);
int err;
+ panic("Pseudo-NMI");
+
if (irqs_enabled)
nmi_enter();
@@ -1654,8 +1656,10 @@ static void gic_enable_nmi_support(void)
* be in the non-secure range, we use a different PMR value to mask IRQs
* and the rest of the values that we use remain unchanged.
*/
- if (gic_has_group0() && !gic_dist_security_disabled())
+ if (gic_has_group0() && !gic_dist_security_disabled()) {
+ pr_info("enabling gic_nonsecure_priorities");
static_branch_enable(&gic_nonsecure_priorities);
+ }
static_branch_enable(&supports_pseudo_nmis);
And a truncated log for FVP:
Boot-wrapper v0.2
[..]
[ 0.000000] Kernel command line: earlycon=pl011,0x1c090000 console=ttyAMA0
kpti=off root=/dev/vda irqchip.gicv3_pseudo_nmi=1
[..]
[ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[ 0.000000] GICv3: GIC: Using split EOI/Deactivate mode
[ 0.000000] GICv3: 224 SPIs implemented
[ 0.000000] GICv3: 0 Extended SPIs implemented
[ 0.000000] GICv3: Distributor has no Range Selector support
[ 0.000000] Root IRQ handler: gic_handle_irq
[ 0.000000] GICv3: 16 PPIs implemented
[ 0.000000] GICv3: CPU0: found redistributor 0 region 0:0x000000002f100000
[ 0.000000] GICv3: Pseudo-NMIs enabled using relaxed ICC_PMR_EL1 synchronisation
[ 0.000000] GICv3: enabling gic_nonsecure_priorities
[..]
[ 0.037784] armv8-pmu pmu: hw perfevents: no interrupt-affinity property, guessing.
[ 0.037901] hw perfevents: enabled with armv8_pmuv3 PMU driver, 9 counters
available, using NMIs
[..]
# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
9: 0 0 0 0 GICv3 25 Level vgic
11: 0 0 0 0 GICv3 30 Level kvm guest
ptimer
12: 0 0 0 0 GICv3 27 Level kvm guest
vtimer
13: 83 77 69 88 GICv3 26 Level arch_timer
16: 0 0 0 0 GICv3 92 Level arm-pmu
17: 0 0 0 0 GICv3 93 Level arm-pmu
18: 0 0 0 0 GICv3 94 Level arm-pmu
19: 0 0 0 0 GICv3 95 Level arm-pmu
22: 0 0 0 0 GICv3 47 Level eth0
24: 33 0 0 0 GICv3 37 Level uart-pl011
29: 0 0 0 0 GICv3 36 Level rtc-pl031
31: 138 0 0 0 GICv3 74 Level virtio0
IPI0: 251 334 252 172 Rescheduling interrupts
IPI1: 26 23 10 20 Function call interrupts
IPI2: 0 0 0 0 CPU stop interrupts
IPI3: 0 0 0 0 CPU stop (for crash dump)
interrupts
IPI4: 0 0 0 0 Timer broadcast interrupts
IPI5: 0 0 0 0 IRQ work interrupts
IPI6: 0 0 0 0 CPU wake-up interrupts
Err: 0
# perf record -a -- sleep 5
^C[ perf record: Woken up 1 times to write data ]
[ 6.908820] random: perf: uninitialized urandom read (6 bytes read)
[ 7.596139] random: perf: uninitialized urandom read (6 bytes read)
[ perf record: Captured and wrote 0.013 MB perf.data (24 samples) ]
#
# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
9: 0 0 0 0 GICv3 25 Level vgic
11: 0 0 0 0 GICv3 30 Level kvm guest
ptimer
12: 0 0 0 0 GICv3 27 Level kvm guest
vtimer
13: 546 115 82 132 GICv3 26 Level arch_timer
16: 6 0 0 0 GICv3 92 Level arm-pmu
17: 0 10 0 0 GICv3 93 Level arm-pmu
18: 0 0 4 0 GICv3 94 Level arm-pmu
19: 0 0 0 4 GICv3 95 Level arm-pmu
22: 0 0 0 0 GICv3 47 Level eth0
24: 80 0 0 0 GICv3 37 Level uart-pl011
29: 0 0 0 0 GICv3 36 Level rtc-pl031
31: 195 0 0 0 GICv3 74 Level virtio0
IPI0: 259 367 263 224 Rescheduling interrupts
IPI1: 27 51 16 24 Function call interrupts
IPI2: 0 0 0 0 CPU stop interrupts
IPI3: 0 0 0 0 CPU stop (for crash dump)
interrupts
IPI4: 0 0 0 0 Timer broadcast interrupts
IPI5: 0 0 0 0 IRQ work interrupts
IPI6: 0 0 0 0 CPU wake-up interrupts
Err: 0
#
With this change to the driver:
$ git diff
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index e0f4debe64e1..e58e62ab64fe 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -623,6 +623,8 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs)
bool irqs_enabled = interrupts_enabled(regs);
int err;
+ panic("Pseudo-NMI");
+
if (irqs_enabled)
nmi_enter();
@@ -687,7 +689,7 @@ static asmlinkage void __exception_irq_entry
gic_handle_irq(struct pt_regs *regs
return;
if (gic_supports_nmi() &&
- unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) {
+ unlikely(gic_read_rpr() == ((GICD_INT_NMI_PRI >> 1) | 0x80))) {
gic_handle_nmi(irqnr, regs);
return;
}
@@ -1654,8 +1656,10 @@ static void gic_enable_nmi_support(void)
* be in the non-secure range, we use a different PMR value to mask IRQs
* and the rest of the values that we use remain unchanged.
*/
- if (gic_has_group0() && !gic_dist_security_disabled())
+ if (gic_has_group0() && !gic_dist_security_disabled()) {
+ pr_info("enabling gic_nonsecure_priorities");
static_branch_enable(&gic_nonsecure_priorities);
+ }
static_branch_enable(&supports_pseudo_nmis);
This is what I get from FVP (also truncated):
Boot-wrapper v0.2
[..]
[ 0.000000] Kernel command line: earlycon=pl011,0x1c090000 console=ttyAMA0
kpti=off root=/dev/vda irqchip.gicv3_pseudo_nmi=1
[..]
[ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[ 0.000000] GICv3: GIC: Using split EOI/Deactivate mode
[ 0.000000] GICv3: 224 SPIs implemented
[ 0.000000] GICv3: 0 Extended SPIs implemented
[ 0.000000] GICv3: Distributor has no Range Selector support
[ 0.000000] Root IRQ handler: gic_handle_irq
[ 0.000000] GICv3: 16 PPIs implemented
[ 0.000000] GICv3: CPU0: found redistributor 0 region 0:0x000000002f100000
[ 0.000000] GICv3: Pseudo-NMIs enabled using relaxed ICC_PMR_EL1 synchronisation
[ 0.000000] GICv3: enabling gic_nonsecure_priorities
[..]
[ 0.037749] armv8-pmu pmu: hw perfevents: no interrupt-affinity property, guessing.
[ 0.037860] hw perfevents: enabled with armv8_pmuv3 PMU driver, 9 counters
available, using NMIs
[..]
# perf record -a -- sleep 10
[ 3.556669] Kernel panic - not syncing: Pseudo-NMI
[ 3.556676] CPU: 0 PID: 145 Comm: perf Not tainted 5.14.0-rc5-dirty #23
[ 3.556690] Hardware name: FVP Base (DT)
[ 3.556692] Call trace:
[ 3.556700] dump_backtrace+0x0/0x19c
[ 3.556710] show_stack+0x1c/0x70
[ 3.556725] dump_stack_lvl+0x68/0x84
[ 3.556729] dump_stack+0x1c/0x38
[ 3.556741] panic+0x16c/0x340
[ 3.556750] gic_handle_irq+0x178/0x1a4
[ 3.556760] call_on_irq_stack+0x2c/0x58
[ 3.556776] do_interrupt_handler+0x54/0x60
[ 3.556791] el1_interrupt+0x30/0xa0
[ 3.556800] el1h_64_irq_handler+0x1c/0x2c
[ 3.556809] el1h_64_irq+0x78/0x7c
[ 3.556819] do_vfs_ioctl+0x90/0xd30
[ 3.556830] __arm64_sys_ioctl+0x88/0xf0
[ 3.556842] invoke_syscall+0x48/0x114
[ 3.556849] el0_svc_common+0x48/0x17c
[ 3.556859] do_el0_svc+0x2c/0x94
[ 3.556873] el0_svc+0x2c/0x5c
[ 3.556879] el0t_64_sync_handler+0xa8/0x130
[ 3.556889] el0t_64_sync+0x198/0x19c
[ 3.556907] SMP: stopping secondary CPUs
[ 3.556919] Kernel Offset: 0x57c5cb480000 from 0xffff800010000000
[ 3.556922] PHYS_OFFSET: 0xffffebb580000000
[ 3.556930] CPU features: 0x000000c1,ff3d4eef
[ 3.556940] Memory Limit: none
[ 3.556949] ---[ end Kernel panic - not syncing: Pseudo-NMI ]---
>
>> but when I changed the comparison to gic_read_rpr() ==
>> ((GICD_INT_NMI_PRI >> 1) | 0x80), NMIs were being correctly handled
>> again.
> You have completely lost me. This contradicts what you have written
> above.
I don't see how that is the case - ICC_RPR_EL1 contains the priority value as seen
by the Distributor, and non-secure priorities get right-shifted when
SCR_EL3.FIQ=0, meaning that GICD_INT_NMI_PRI becomes (GICD_INT_NMI_PRI >> 1) |
0x80 in the Distributor. Can you elaborate where I'm contradicting myself?
Thanks,
Alex