Re: [PATCH] intel_th: core: fix null pointer dereference in intel_th_irq

From: David Arcari

Date: Fri Sep 26 2025 - 16:12:46 EST



Hi Alex,

On 9/26/25 12:19 PM, alex@ash.works wrote:
On 2025-08-25 20:45, David Arcari wrote:
In certain cases intel_th_irq can reference a null entry in
the th->thdev array.  This results in the splat shown below.
The problem is that intel_th_output_enable() can modify the
thdev[] array at the same time intel_th_irq is referencing
the same array.  This can be fixed by disabling interrupts
during the call to intel_th_output_enable().

Hi David,

Thank you for the bug report and rootcausing! Can you please also
detail the sequence of actions by which this is reproduced, so
that I can test my fix and not bother you with a back-and-forth
over-email debugging and also add it to our regression testing?
Doesn't have to be a shell script (although I wouldn't say no
to that), plain english would work in a pinch. If you have the
time, I'm also curious about your use case for intel_th.

Unfortunately, I don't have a great reproducer. I have a system which is afflicted in ~ 1 out of every 100 reboots. Adding debug code made the problem easier to reproduce.


This has eluded our testing for about 10 years, so I'm very
interested in the reproducer.

BUG: kernel NULL pointer dereference, address: 0000000000000304
Oops: Oops: 0000 [#1] SMP NOPTI
RIP: 0010:intel_th_irq+0x26/0x70 [intel_th]

Yes, this is absolutely a bug.

@@ -715,7 +715,9 @@ intel_th_subdevice_alloc(struct intel_th *th,
 int intel_th_output_enable(struct intel_th *th, unsigned int otype)
 {
     struct intel_th_device *thdev;
-    int src = 0, dst = 0;
+    int src = 0, dst = 0, ret = 0;
+
+    disable_irq(th->irq);

     for (src = 0, dst = 0; dst <= th->num_thdevs; src++, dst++) {
         for (; src < ARRAY_SIZE(intel_th_subdevices); src++) {

[...]

@@ -750,16 +752,19 @@ int intel_th_output_enable(struct intel_th *th, unsigned int otype)
             goto found;
     }

+nodev:
+    enable_irq(th->irq);
     return -ENODEV;

 found:
     thdev = intel_th_subdevice_alloc(th, &intel_th_subdevices[src]);
     if (IS_ERR(thdev))
-        return PTR_ERR(thdev);
-
-    th->thdev[th->num_thdevs++] = thdev;
+        ret = PTR_ERR(thdev);
+    else
+        th->thdev[th->num_thdevs++] = thdev;

-    return 0;
+    enable_irq(th->irq);
+    return ret;
 }
 EXPORT_SYMBOL_GPL(intel_th_output_enable);

This is indeed a possible fix, but I believe a little bit of
serialization can be employed here.

I was thinking there was a better approach. Given the situation if you'd like me to test a fix, I can do so,


Lastly, my apologies for tardiness.

No worries.

Best,
-DA


Thanks!
--
Alex