Re: NULL pointer dereference in msi_set_mask_bit

From: Paul Menzel
Date: Thu Jul 19 2018 - 09:56:14 EST


Dear Thomas,


On 07/19/18 15:48, Thomas Gleixner wrote:

> On Thu, 19 Jul 2018, Paul Menzel wrote:
>
>> I had to copy the files, and then was able to create an archive with
>> non-zero files. Please find the tar archive attached.
>
> Thanks for providing the data. All looks normal there.

Thank you for verifying that.

> Just for clarification. Did this happen exactly once, or did it just not
> happen again after you plugged in a serial cable?

The kernel also panics sometimes when loading the amdgpu module [1]. Before
using the serial console it crashed often, but I do *not* know the reason.

After connecting the serial console, the problem happened exactly
once.

> One thing you might try is to disable irqbalanced when the machine is up
> and then stress the affinity setter mechanism with scripting.
>
> Something like
>
> while true; do
> for I in {0..3}; do echo $I > /proc/irq/$IRQ/smp_affinity_list; done
> done
>
> might be able to trigger it. But don't ask me which interrupt was involved,
> so you have to iterate through the ones which are MSI based.
>
> 'cat /proc/interrupts | grep MSI' will tell you.

Thank you. I guess I'll use the machine for some time, and observe if the
problem shows up again.
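
For reference, the suggestion quoted above could be scripted roughly as
follows. This is only a sketch under assumptions: the function names
list_msi_irqs and stress_irq_affinity are made up for illustration, the
CPU count of four is taken from the {0..3} range in the quoted loop, and
a real run would need root with irqbalance stopped.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers, not existing tools.

# Print the numbers of all interrupts whose line in the interrupts table
# mentions MSI.
# $1: file to parse (defaults to /proc/interrupts)
list_msi_irqs() {
    # Keep only rows that start with a numeric IRQ ("27:"), which skips the
    # CPU header and named rows such as NMI or LOC.
    awk '/MSI/ && $1 ~ /^[0-9]+:$/ { sub(":", "", $1); print $1 }' \
        "${1:-/proc/interrupts}"
}

# Cycle the affinity of one interrupt across CPUs 0..(ncpus-1).
# $1: IRQ number, $2: number of CPUs, $3: number of rounds,
# $4: base directory (defaults to /proc/irq)
stress_irq_affinity() {
    sia_file="${4:-/proc/irq}/$1/smp_affinity_list"
    sia_round=0
    while [ "$sia_round" -lt "$3" ]; do
        sia_cpu=0
        while [ "$sia_cpu" -lt "$2" ]; do
            # Some interrupts reject affinity changes; silence and ignore
            # those failures so the loop keeps going.
            { echo "$sia_cpu" > "$sia_file"; } 2>/dev/null || true
            sia_cpu=$((sia_cpu + 1))
        done
        sia_round=$((sia_round + 1))
    done
}

# Possible use (as root, irqbalance stopped), 1000 rounds per MSI interrupt:
#   for irq in $(list_msi_irqs); do stress_irq_affinity "$irq" 4 1000; done
```

Bounding the number of rounds, instead of looping forever, makes it easier
to note which interrupt was being exercised if the machine does crash.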

Thank you very much for your awesome help.


Kind regards,

Paul


[1]: https://bugs.freedesktop.org/show_bug.cgi?id=105684
