Re: BUG: spinlock bad magic on CPU#1, irq/39-firewire/245 (v6.18-rc4, ppc64)

From: Takashi Sakamoto

Date: Tue Nov 11 2025 - 10:00:35 EST


Hi,

Thanks for the report, and sorry for the inconvenience.

On Tue, Nov 11, 2025 at 01:41:21PM +0100, Erhard Furtner wrote:
> On 11/9/25 15:17, Erhard Furtner wrote:
> > [...]
> > firewire_ohci 0001:03:0e.0: added OHCI v1.0 device as card 0, 8 IR + 8
> > IT contexts, quirks 0x0
> > BUG: spinlock bad magic on CPU#1, irq/39-firewire/245
> >  lock: 0xc00000001f672618, .magic: 00000000, .owner: irq/39-
> > firewire/245, .owner_cpu: 1
> > CPU: 1 UID: 0 PID: 245 Comm: irq/39-firewire Tainted: G N  6.18.0-rc4-
> > PMacG5 #1 PREEMPTLAZY
> > Tainted: [N]=TEST
> > Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac
> > Call Trace:
> > [c000000005dafb20] [c000000000bc054c] __dump_stack+0x30/0x54 (unreliable)
> > [c000000005dafb50] [c000000000bc04e4] dump_stack_lvl+0x98/0xd0
> > [c000000005dafb90] [c0000000000f22a8] spin_dump+0x88/0xb4
> > [c000000005dafc10] [c0000000000f1d4c] do_raw_spin_unlock+0xdc/0x164
> > [c000000005dafc50] [c000000000bf65d0] _raw_spin_unlock+0x18/0x68
> > [c000000005dafc70] [c0003d0013ce1d5c]
> > fw_core_handle_bus_reset+0xa98/0xb64 [firewire_core]
> > [c000000005dafdc0] [c0003d0013d19aec]
> > handle_selfid_complete_event+0x610/0x764 [firewire_ohci]
> > [c000000005dafe80] [c000000000106050] irq_thread_fn+0x40/0x9c
> > [c000000005dafec0] [c000000000105ecc] irq_thread+0x1c0/0x298
> > [c000000005daff60] [c0000000000b5e54] kthread+0x250/0x280
> > [c000000005daffe0] [c00000000000bd30] start_kernel_thread+0x14/0x18
> I bisected the issue. First bad commit is:
>
> # git bisect good
> 7d138cb269dbd2fa9b0da89a9c10503d1cf269d5 is the first bad commit
> commit 7d138cb269dbd2fa9b0da89a9c10503d1cf269d5
> Author: Takashi Sakamoto <o-takashi@xxxxxxxxxxxxx>
> Date: Tue Sep 16 08:47:44 2025 +0900
>
> firewire: core: use spin lock specific to topology map
>
> At present, the operation for read transaction to topology map register
> is
> not protected by any kind of lock primitives. This causes a potential
> problem to result in the mixed content of topology map.
>
> This commit adds and uses spin lock specific to topology map.
>
> Link:
> https://lore.kernel.org/r/20250915234747.915922-4-o-takashi@xxxxxxxxxxxxx
> Signed-off-by: Takashi Sakamoto <o-takashi@xxxxxxxxxxxxx>
>
> drivers/firewire/core-topology.c | 22 ++++++++++++++--------
> drivers/firewire/core-transaction.c | 6 +++++-
> include/linux/firewire.h | 6 +++++-
> 3 files changed, 24 insertions(+), 10 deletions(-)
>
>
> Bisect.log attached.

At present, I suspect a buffer overflow past the end of 'struct
fw_card.topology_map.buffer[256]', caused by an unexpected value of the
'self_id_count' argument passed to 'fw_core_handle_bus_reset()'. That
would mean that the 1394 OHCI PCI driver computes an unexpected value on
your machine.
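
As a rough illustration of why such an overflow could show up as a bad
spinlock magic, here is a minimal user-space sketch. It is not the kernel
code: 'struct topology_map_model', 'MODEL_MAGIC' and 'fill_unchecked()'
are hypothetical names, and the assumption that the lock sits directly
after the buffer in the structure is mine for the sake of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAP_WORDS   256          /* matches topology_map.buffer[256] above */
#define MODEL_MAGIC 0xdead4eadu  /* stand-in for the lock's debug magic */

/* Hypothetical model: the field order is an assumption for illustration. */
struct topology_map_model {
	uint32_t buffer[MAP_WORDS];
	uint32_t lock_magic;     /* stands in for the magic inside spinlock_t */
};

/*
 * Copy self_id_count words behind the 3 header words without checking
 * against MAP_WORDS, the way an unclamped copy loop would. The assert only
 * keeps this demonstration inside the model structure.
 */
static void fill_unchecked(struct topology_map_model *m,
			   const uint32_t *self_ids, size_t self_id_count)
{
	unsigned char *raw = (unsigned char *)m;
	size_t offset = offsetof(struct topology_map_model, buffer) +
			3 * sizeof(uint32_t);
	size_t bytes = self_id_count * sizeof(uint32_t);

	assert(offset + bytes <= sizeof(*m));
	memcpy(raw + offset, self_ids, bytes);
}
```

In this model, 253 self ID words fit exactly behind the header and leave
'lock_magic' untouched, while the 254th word lands on 'lock_magic' and
replaces it.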

Could I ask you to collect more detailed data with the following steps?

Step 1. Apply the following patch, which avoids the suspected buffer
overflow by keeping the write cursor within the buffer.

```
diff --git a/drivers/firewire/core-topology.c b/drivers/firewire/core-topology.c
index 2f73bcd5696f..5e66428ec4b0 100644
--- a/drivers/firewire/core-topology.c
+++ b/drivers/firewire/core-topology.c
@@ -442,6 +442,7 @@ static void update_topology_map(__be32 *buffer, size_t buffer_size, int root_nod
 {
 	__be32 *map = buffer;
 	int node_count = (root_node_id & 0x3f) + 1;
+	size_t count;

 	memset(map, 0, buffer_size);

@@ -449,7 +450,10 @@ static void update_topology_map(__be32 *buffer, size_t buffer_size, int root_nod
 	*map++ = cpu_to_be32(be32_to_cpu(buffer[1]) + 1);
 	*map++ = cpu_to_be32((node_count << 16) | self_id_count);

-	while (self_id_count--)
+	count = buffer_size / sizeof(*buffer) - 3;
+	if (self_id_count > 0 && count > self_id_count)
+		count = self_id_count;
+	while (count--)
 		*map++ = cpu_to_be32p(self_ids++);

 	fw_compute_block_crc(buffer);

```
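
To make the effect of the clamp easier to check outside the kernel, the
following user-space sketch mirrors the arithmetic of the patch. The
function name 'clamped_copy_count()' is hypothetical, and 'uint32_t'
stands in for '__be32'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return how many self ID words the patched loop copies. Three words of
 * the buffer are reserved for the header, so the capacity is
 * buffer_size / 4 - 3.
 */
static size_t clamped_copy_count(size_t buffer_size, int self_id_count)
{
	size_t count;

	/* The buffer must hold at least the three header words. */
	assert(buffer_size >= 3 * sizeof(uint32_t));

	count = buffer_size / sizeof(uint32_t) - 3;
	if (self_id_count > 0 && count > (size_t)self_id_count)
		count = (size_t)self_id_count;
	return count;
}
```

For a 1024-byte buffer the capacity is 253 words, so the function returns
4 for a self_id_count of 4, and 253 for a self_id_count of 300, 0 or -1.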

Step 2. The value of self_id_count can be retrieved as part of the
'firewire:bus_reset_handle' tracepoint event. Please use the Linux
tracepoints framework[1] to record the event log. Unbinding and
rebinding the firewire-ohci driver should be a convenient way to trigger
the event[2].

[1] https://docs.kernel.org/trace/events.html
[2] https://lwn.net/Articles/143397/
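
Step 2 amounts to writing short strings to a few control files. The
following is only a sketch of one way to do that from C; writing to the
same files from a shell works equally well. The helper name
'write_control()' is hypothetical, and the paths in the note below are
assumptions about a typical setup (tracefs mounted at
/sys/kernel/tracing, the PCI address taken from the log above).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write a short string to a tracefs or sysfs control file.
 * Returns 0 on success and -1 if the file cannot be opened or written.
 */
static int write_control(const char *path, const char *value)
{
	FILE *f;
	int ok;

	assert(path != NULL && value != NULL);

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fputs(value, f) >= 0;
	/* The data reaches the kernel on flush, so check fclose() as well. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Under those assumptions, and run as root, the sequence would be to write
"1" to /sys/kernel/tracing/events/firewire/bus_reset_handle/enable, then
"0001:03:0e.0" to /sys/bus/pci/drivers/firewire_ohci/unbind and to
/sys/bus/pci/drivers/firewire_ohci/bind, and finally to read
/sys/kernel/tracing/trace.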


Thanks

Takashi Sakamoto