Re: [PATCH V3] irqchip/gic-v3: Workaround Marvell erratum 38545 when reading IAR

From: Marc Zyngier
Date: Wed Mar 09 2022 - 12:50:47 EST


On 2022-03-09 17:40, Qian Cai wrote:
On Mon, Mar 07, 2022 at 08:00:14PM +0530, Linu Cherian wrote:
When an IAR register read races with a GIC interrupt RELEASE event,
the GIC CPU interface could wrongly return a valid INTID to the CPU
for an interrupt that is already released (non-activated) instead of 0x3ff.

As a side effect, an interrupt handler could run twice, once with
interrupt priority and then with idle priority.

As a workaround, gic_read_iar is updated so that it will return a
valid interrupt ID only if there is a change in the active priority list
after the IAR read on all the affected Silicons.
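
The check described above can be sketched in userspace as follows. This is
only an illustration under stated assumptions, not the code from the patch:
struct mock_gic, mock_read_apr(), mock_read_iar() and read_iar_checked() are
hypothetical names, and the mock models "a genuine acknowledge changes the
active priority state, a stale one does not" in the simplest possible way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPURIOUS_INTID 0x3ffu	/* value reported when nothing is pending */

/* Hypothetical mock of the CPU interface state; not real registers. */
struct mock_gic {
	uint64_t active_prio;	/* stand-in for the active priority bits */
	uint64_t next_iar;	/* INTID the next IAR read will report */
	uint64_t ack_prio_mask;	/* bits a genuine ack sets; 0 = stale INTID */
};

static uint64_t mock_read_apr(const struct mock_gic *g)
{
	return g->active_prio;
}

static uint64_t mock_read_iar(struct mock_gic *g)
{
	/* A genuine acknowledge marks a priority as active; a stale one does not. */
	g->active_prio |= g->ack_prio_mask;
	return g->next_iar;
}

/* Trust the INTID only if the active priority state changed across the read. */
static uint64_t read_iar_checked(struct mock_gic *g)
{
	uint64_t before, iar;

	assert(g != NULL);
	before = mock_read_apr(g);
	iar = mock_read_iar(g);

	if (mock_read_apr(g) != before)
		return iar;

	return SPURIOUS_INTID;
}
```

With a mock whose ack_prio_mask is non-zero, read_iar_checked() hands back
the INTID; with ack_prio_mask set to 0 (the racing, already released case)
it reports 0x3ff instead.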

Since there are silicon variants where both 23154 and 38545 are applicable,
workaround for erratum 23154 has been extended to address both of them.

Signed-off-by: Linu Cherian <lcherian@xxxxxxxxxxx>

Reverting this commit from today's linux-next fixed global-out-of-bounds
accesses running CPU hotplug workloads on a non-ThunderX server.

psci: CPU88 killed (polled 0 ms)
==================================================================
BUG: KASAN: global-out-of-bounds in is_affected_midr_range_list
Read of size 4 at addr ffffa0ec80ddcc6c by task swapper/88/0

CPU: 88 PID: 0 Comm: swapper/88 Not tainted 5.17.0-rc7-next-20220309-dirty #25
Call trace:
dump_backtrace
show_stack
dump_stack_lvl
print_address_description.constprop.0
print_report
kasan_report
__asan_report_load4_noabort
is_affected_midr_range_list
is_midr_in_range_list at ./arch/arm64/include/asm/cputype.h:221
(inlined by) is_affected_midr_range_list at arch/arm64/kernel/cpu_errata.c:41
verify_local_cpu_caps
verify_local_cpu_caps at arch/arm64/kernel/cpufeature.c:2787
check_local_cpu_capabilities
verify_local_elf_hwcaps at arch/arm64/kernel/cpufeature.c:2852
(inlined by) verify_local_cpu_capabilities at
arch/arm64/kernel/cpufeature.c:2922
(inlined by) check_local_cpu_capabilities at
arch/arm64/kernel/cpufeature.c:2948
secondary_start_kernel
__secondary_switched

The buggy address belongs to the variable:
cavium_erratum_23154_cpus

The buggy address belongs to the virtual mapping at
[ffffa0ec80dd0000, ffffa0ec82140000) created by:
map_kernel

Urgh... Thanks for reporting this.

Will, can you either drop this patch, or squash the following
diff in?

Thanks,

M.

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 1d9d4f910de7..400a1c9cac90 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -225,6 +225,7 @@ const struct midr_range cavium_erratum_23154_cpus[] = {
MIDR_ALL_VERSIONS(MIDR_OCTX2_95XXN),
MIDR_ALL_VERSIONS(MIDR_OCTX2_95XXMM),
MIDR_ALL_VERSIONS(MIDR_OCTX2_95XXO),
+ {},
};
#endif
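
For anyone wondering why the empty entry matters: the list walker has no
length to go by and stops only when it reaches an entry whose model is zero,
so a table without that entry gets read past its end on any CPU that matches
nothing in it. A minimal userspace sketch of that pattern follows, assuming a
simplified layout. struct midr_range_sketch, in_range(), in_range_list() and
the model values are hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct midr_range. */
struct midr_range_sketch {
	uint32_t model;		/* 0 marks the end of the list */
	uint32_t rv_min;	/* lowest affected revision */
	uint32_t rv_max;	/* highest affected revision */
};

static bool in_range(uint32_t model, uint32_t rv,
		     const struct midr_range_sketch *r)
{
	return model == r->model && rv >= r->rv_min && rv <= r->rv_max;
}

/* Walk the table until the zero-model sentinel; there is no length argument. */
static bool in_range_list(uint32_t model, uint32_t rv,
			  const struct midr_range_sketch *ranges)
{
	assert(ranges != NULL);

	for (; ranges->model; ranges++)
		if (in_range(model, rv, ranges))
			return true;

	return false;
}

/* Made-up model numbers, terminated the same way the diff above does. */
static const struct midr_range_sketch affected[] = {
	{ 0xb1, 0x00, 0xff },
	{ 0xb2, 0x00, 0xff },
	{ 0 },	/* sentinel: drop this and a non-matching lookup overruns */
};
```

A lookup for a model that is in the table returns before the end is reached,
which would explain why only non-matching CPUs trip over a missing
terminator.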


--
Jazz is not dead. It just smells funny...