[PATCH AUTOSEL 6.19-6.18] mailbox: mchp-ipc-sbi: fix out-of-bounds access in mchp_ipc_get_cluster_aggr_irq()

From: Sasha Levin

Date: Sun Feb 15 2026 - 12:43:21 EST

From: Valentina Fernandez <valentina.fernandezalanis@xxxxxxxxxxxxx>

[ Upstream commit f7c330a8c83c9b0332fd524097eaf3e69148164d ]

The cluster_cfg array is dynamically allocated to hold per-CPU
configuration structures, with its size based on the number of online
CPUs. Previously, this array was indexed using hartid, which may be
non-contiguous or exceed the bounds of the array, leading to
out-of-bounds access.
Switch to using cpuid as the index, as it is guaranteed to be within
the valid range provided by for_each_online_cpu().

Signed-off-by: Valentina Fernandez <valentina.fernandezalanis@xxxxxxxxxxxxx>
Reviewed-by: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
Signed-off-by: Jassi Brar <jassisinghbrar@xxxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---

LLM Generated explanations, may be completely bogus:

The driver was introduced in v6.14-rc1. This means it's only present in
very recent kernel trees — 6.14 and later. It would only need
backporting to 6.14.y and potentially 6.15.y if those exist as stable
branches.

The ISR is affected as well. The current (unfixed) code at line 183
reads:

```c
if (irq == ipc->cluster_cfg[hartid].irq)
```

The fix changes this to `ipc->cluster_cfg[i].irq`, where `i` is the
`for_each_online_cpu` iterator. This is correct: `i` will always be
within `[0, num_online_cpus())`.
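
The split the fixed ISR relies on (index the array by logical CPU
number, report the hart ID separately) can be sketched in user space as
follows. This is an illustrative model, not the driver's code:
`struct demo_cluster_cfg`, `demo_find_slot()` and the caller-supplied
hart ID map are hypothetical names, and the explicit not-found return
value is an addition of this sketch.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-in for the driver's per-CPU slot. */
struct demo_cluster_cfg {
	int irq;
};

/*
 * Find the slot whose irq matches. The array index is the logical
 * cpuid; the hart ID is only looked up so it can be reported.
 * Returns the cpuid on success, or -1 if no slot matches (the
 * not-found case is an addition of this sketch).
 */
static int demo_find_slot(const struct demo_cluster_cfg *cfg,
			  const unsigned long *hartid_map, size_t ncpus,
			  int irq, unsigned long *hartid_out)
{
	size_t i;

	for (i = 0; i < ncpus; i++) {
		if (cfg[i].irq == irq) {
			*hartid_out = hartid_map[i];
			return (int)i;
		}
	}

	return -1;
}
```

With four slots and an assumed map of CPUs 0..3 to harts 1..4, a match
on the last slot returns index 3 while reporting hart 4, so the array
is never indexed by the hart ID itself.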

## Summary of Findings

### What the commit fixes
An **out-of-bounds array access** in the Microchip IPC SBI mailbox
driver. The `cluster_cfg` array is allocated with `num_online_cpus()`
elements but was indexed using RISC-V `hartid`, which can exceed array
bounds when hart IDs are non-contiguous. This affects both the setup
path (`mchp_ipc_get_cluster_aggr_irq`) and the interrupt handler
(`mchp_ipc_cluster_aggr_isr`).
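
To make the mismatch concrete, here is a minimal user-space sketch of
the bounds problem. It is not driver code: the `demo_` names are
hypothetical, and the map assumes a system where Linux CPUs 0..3 run on
harts 1..4, so the array has four elements but the largest hart ID is 4.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed cpuid -> hartid map: CPUs 0..3 run on harts 1..4. */
static const unsigned long demo_hartid_map[] = { 1, 2, 3, 4 };

#define DEMO_NUM_ONLINE_CPUS \
	(sizeof(demo_hartid_map) / sizeof(demo_hartid_map[0]))

/* True if idx is a valid index into an array of n elements. */
static bool demo_index_in_bounds(unsigned long idx, size_t n)
{
	return idx < n;
}

/* Count the CPUs whose hart ID would index past an n-element array. */
static size_t demo_count_oob_by_hartid(const unsigned long *map, size_t n)
{
	size_t cpuid, oob = 0;

	for (cpuid = 0; cpuid < n; cpuid++)
		if (!demo_index_in_bounds(map[cpuid], n))
			oob++;

	return oob;
}

/* Same check, but indexing by the loop counter (the cpuid) instead. */
static size_t demo_count_oob_by_cpuid(size_t n)
{
	size_t cpuid, oob = 0;

	for (cpuid = 0; cpuid < n; cpuid++)
		if (!demo_index_in_bounds(cpuid, n))
			oob++;

	return oob;
}
```

Under these assumptions, indexing by hart ID goes out of bounds for one
CPU (the one on hart 4), while indexing by cpuid never does.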

### Bug severity
- **Out-of-bounds write** in probe path: memory corruption
- **Out-of-bounds read** in IRQ context: potential crash/panic
- This is a real, triggerable bug on any RISC-V system with
non-contiguous hart IDs using Microchip IPC hardware

### Stable kernel criteria
1. **Obviously correct and tested**: Yes — the fix simply changes array
indexing from hartid to cpuid, which is guaranteed in-bounds.
Reviewed by Conor Dooley (Microchip).
2. **Fixes a real bug**: Yes — out-of-bounds memory access
3. **Important issue**: Yes — memory corruption and potential kernel
crash
4. **Small and contained**: Yes — only changes one file, replacing
`hartid` with `cpuid`/`i` in array accesses
5. **No new features**: Correct, pure bug fix
6. **Applies cleanly**: Straightforward change to a single file

### Risk assessment
- **Very low risk** — the change is mechanical substitution of the array
index from hartid to cpuid
- The fix is purely within one driver file
- No behavioral change other than fixing the out-of-bounds access

### Concerns
- The driver was only introduced in v6.14-rc1, so it's only relevant for
6.14.y stable trees and newer
- No dependency on other commits; self-contained fix

## Verification

- **Read the full driver source**
(`drivers/mailbox/mailbox-mchp-ipc-sbi.c`): Confirmed the allocation at
line 465 uses `num_online_cpus()` and the buggy accesses use `hartid`
as index
- **Verified driver introduction**: `git log --diff-filter=A` shows the
driver was added in commit e4b1d67e71419, which is `v6.14-rc1~97^2~10`
— only in 6.14+
- **Confirmed the bug mechanism**: RISC-V hartids can be non-contiguous
(this is well-known in the RISC-V kernel community), so
`hartid >= num_online_cpus()` is a realistic scenario
- **Confirmed the ISR is also affected**: Line 183 in the unfixed code
accesses `cluster_cfg[hartid]` in interrupt context
- **Confirmed the fix is correct**: `cpuid` from `for_each_online_cpu()`
is always in range `[0, num_online_cpus())`
- **Verified the patch is reviewed**: "Reviewed-by: Conor Dooley"
(Microchip kernel developer)

This is a clear, small, correct fix for a real out-of-bounds memory
access bug that can cause memory corruption or kernel crashes. It meets
all stable kernel criteria.

**YES**

drivers/mailbox/mailbox-mchp-ipc-sbi.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/mailbox/mailbox-mchp-ipc-sbi.c b/drivers/mailbox/mailbox-mchp-ipc-sbi.c
index a6e52009a4245..d444491a584e8 100644
--- a/drivers/mailbox/mailbox-mchp-ipc-sbi.c
+++ b/drivers/mailbox/mailbox-mchp-ipc-sbi.c
@@ -180,20 +180,20 @@ static irqreturn_t mchp_ipc_cluster_aggr_isr(int irq, void *data)
/* Find out the hart that originated the irq */
for_each_online_cpu(i) {
hartid = cpuid_to_hartid_map(i);
- if (irq == ipc->cluster_cfg[hartid].irq)
+ if (irq == ipc->cluster_cfg[i].irq)
break;
}

status_msg.cluster = hartid;
- memcpy(ipc->cluster_cfg[hartid].buf_base, &status_msg, sizeof(struct mchp_ipc_status));
+ memcpy(ipc->cluster_cfg[i].buf_base, &status_msg, sizeof(struct mchp_ipc_status));

- ret = mchp_ipc_sbi_send(SBI_EXT_IPC_STATUS, ipc->cluster_cfg[hartid].buf_base_addr);
+ ret = mchp_ipc_sbi_send(SBI_EXT_IPC_STATUS, ipc->cluster_cfg[i].buf_base_addr);
if (ret < 0) {
dev_err_ratelimited(ipc->dev, "could not get IHC irq status ret=%d\n", ret);
return IRQ_HANDLED;
}

- memcpy(&status_msg, ipc->cluster_cfg[hartid].buf_base, sizeof(struct mchp_ipc_status));
+ memcpy(&status_msg, ipc->cluster_cfg[i].buf_base, sizeof(struct mchp_ipc_status));

/*
* Iterate over each bit set in the IHC interrupt status register (IRQ_STATUS) to identify
@@ -385,21 +385,21 @@ static int mchp_ipc_get_cluster_aggr_irq(struct mchp_ipc_sbi_mbox *ipc)
if (ret <= 0)
continue;

- ipc->cluster_cfg[hartid].irq = ret;
- ret = devm_request_irq(ipc->dev, ipc->cluster_cfg[hartid].irq,
+ ipc->cluster_cfg[cpuid].irq = ret;
+ ret = devm_request_irq(ipc->dev, ipc->cluster_cfg[cpuid].irq,
mchp_ipc_cluster_aggr_isr, IRQF_SHARED,
"miv-ihc-irq", ipc);
if (ret)
return ret;

- ipc->cluster_cfg[hartid].buf_base = devm_kmalloc(ipc->dev,
- sizeof(struct mchp_ipc_status),
- GFP_KERNEL);
+ ipc->cluster_cfg[cpuid].buf_base = devm_kmalloc(ipc->dev,
+ sizeof(struct mchp_ipc_status),
+ GFP_KERNEL);

- if (!ipc->cluster_cfg[hartid].buf_base)
+ if (!ipc->cluster_cfg[cpuid].buf_base)
return -ENOMEM;

- ipc->cluster_cfg[hartid].buf_base_addr = __pa(ipc->cluster_cfg[hartid].buf_base);
+ ipc->cluster_cfg[cpuid].buf_base_addr = __pa(ipc->cluster_cfg[cpuid].buf_base);

irq_found = true;
}
--
2.51.0