[PATCH] ipmi: si: Fix NULL pointer dereference in start_kcs_transaction()

From: Seiji Nishikawa

Date: Tue Jun 30 2026 - 10:16:55 EST


try_smi_init() allocates new_smi->si_sm and later calls
ipmi_register_smi(), which maps to ipmi_add_smi().

During ipmi_add_smi(), the upper IPMI message handler obtains the initial
BMC device information through __bmc_get_device_id(). This can fail if the
BMC does not return a successful response to the Get Device ID command.

When the BMC returns a nonzero completion code, the device-id helper
retries the command and eventually returns -EIO if the device ID still
cannot be fetched.

On this failure path, ipmi_add_smi() logs "Unable to get the device id" and
goes to out_err_started, where it invokes the lower driver's shutdown
callback. try_smi_init() then logs the returned registration failure:

ipmi_si IPI0001:00: IPMI message handler: Unable to get the device id: -5
ipmi_si IPI0001:00: Unable to register device: error -5

For ipmi_si, the shutdown callback is shutdown_smi(), which cleans up the
SI state machine data, frees smi_info->si_sm, and sets smi_info->si_sm and
smi_info->intf to NULL.

kfree(smi_info->si_sm);
smi_info->si_sm = NULL;

smi_info->intf = NULL;

However, the smi_info can still be reached later. In the observed case, the
redo_bmc_reg work item retried BMC device-id probing after shutdown_smi()
had already cleared smi_info->si_sm. The retry path reached
start_next_msg(), which passed the NULL smi_info->si_sm pointer to the
selected KCS state machine handler:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
Workqueue: events redo_bmc_reg [ipmi_msghandler]
RIP: start_kcs_transaction+0x2c/0x190 [ipmi_si]
Call Trace:
start_next_msg+0x50/0x80 [ipmi_si]
check_start_timer_thread.part.9+0x3b/0x50 [ipmi_si]
sender+0x69/0x80 [ipmi_si]
i_ipmi_request+0x2ac/0x9d0 [ipmi_msghandler]
__get_device_id.isra.29+0xaa/0x180 [ipmi_msghandler]
__bmc_get_device_id+0xef/0x950 [ipmi_msghandler]
redo_bmc_reg+0x52/0x60 [ipmi_msghandler]
process_one_work+0x1a7/0x360

start_next_msg() and smi_event_handler() both invoke the selected SI state
machine handlers with smi_info->si_sm without checking whether the state
machine data has already been freed and cleared.

Add NULL checks before calling into the selected SI state machine handlers.
If the state machine data has already been removed, treat the interface as
idle and avoid dereferencing the cleared state.

Signed-off-by: Seiji Nishikawa <snishika@xxxxxxxxxx>
---
drivers/char/ipmi/ipmi_si_intf.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 9a9d12be9bf7..d13d5024352a 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -320,6 +320,9 @@ static enum si_sm_result start_next_msg(struct smi_info *smi_info)
{
int rv;

+ if (unlikely(!smi_info->si_sm))
+ return SI_SM_IDLE;
+
if (!smi_info->waiting_msg) {
smi_info->curr_msg = NULL;
rv = SI_SM_IDLE;
@@ -800,6 +803,9 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
{
enum si_sm_result si_sm_result;

+ if (unlikely(!smi_info->si_sm))
+ return SI_SM_IDLE;
+
restart:
if (smi_info->si_state == SI_HOSED)
/* Just in case, hosed state is only left from the timeout. */
--
2.54.0