Re: [PATCH] ipmi: si: Fix NULL pointer dereference in start_kcs_transaction()
From: Corey Minyard
Date: Tue Jun 30 2026 - 11:48:24 EST
On Tue, Jun 30, 2026 at 11:06:33PM +0900, Seiji Nishikawa wrote:
> try_smi_init() allocates new_smi->si_sm and later calls
> ipmi_register_smi(), which maps to ipmi_add_smi().
Thank you, the analysis is correct and very good.
I don't think the fix is right, though. I believe the right way to
handle this is to set intf->in_shutdown right after out_err_started:
in ipmi_add_smi(). That was the intent, this case was missed.
And that way it's fixed for the other IPMI interfaces.
Do you agree?
-corey
>
> During ipmi_add_smi(), the upper IPMI message handler obtains the initial
> BMC device information through __bmc_get_device_id(). This can fail if the
> BMC does not return a successful response to the Get Device ID command.
>
> When the BMC returns a nonzero completion code, the device-id helper
> retries the command and eventually returns -EIO if the device ID still
> cannot be fetched.
>
> On this failure path, ipmi_add_smi() logs "Unable to get the device id" and
> goes to out_err_started, where it invokes the lower driver's shutdown
> callback. try_smi_init() then logs the returned registration failure:
>
> ipmi_si IPI0001:00: IPMI message handler: Unable to get the device id: -5
> ipmi_si IPI0001:00: Unable to register device: error -5
>
> For ipmi_si, the shutdown callback is shutdown_smi(), which cleans up the
> SI state machine data, frees smi_info->si_sm, and sets smi_info->si_sm and
> smi_info->intf to NULL.
>
> kfree(smi_info->si_sm);
> smi_info->si_sm = NULL;
>
> smi_info->intf = NULL;
>
> However, the smi_info can still be reached later. In the observed case, the
> redo_bmc_reg work item retried BMC device-id probing after shutdown_smi()
> had already cleared smi_info->si_sm. The retry path reached
> start_next_msg(), which passed the NULL smi_info->si_sm pointer to the
> selected KCS state machine handler:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> Workqueue: events redo_bmc_reg [ipmi_msghandler]
> RIP: start_kcs_transaction+0x2c/0x190 [ipmi_si]
> Call Trace:
> start_next_msg+0x50/0x80 [ipmi_si]
> check_start_timer_thread.part.9+0x3b/0x50 [ipmi_si]
> sender+0x69/0x80 [ipmi_si]
> i_ipmi_request+0x2ac/0x9d0 [ipmi_msghandler]
> __get_device_id.isra.29+0xaa/0x180 [ipmi_msghandler]
> __bmc_get_device_id+0xef/0x950 [ipmi_msghandler]
> redo_bmc_reg+0x52/0x60 [ipmi_msghandler]
> process_one_work+0x1a7/0x360
>
> start_next_msg() and smi_event_handler() both invoke the selected SI state
> machine handlers with smi_info->si_sm without checking whether the state
> machine data has already been freed and cleared.
>
> Add NULL checks before calling into the selected SI state machine handlers.
> If the state machine data has already been removed, treat the interface as
> idle and avoid dereferencing the cleared state.
>
> Signed-off-by: Seiji Nishikawa <snishika@xxxxxxxxxx>
> ---
> drivers/char/ipmi/ipmi_si_intf.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> index 9a9d12be9bf7..d13d5024352a 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -320,6 +320,9 @@ static enum si_sm_result start_next_msg(struct smi_info *smi_info)
> {
> int rv;
>
> + if (unlikely(!smi_info->si_sm))
> + return SI_SM_IDLE;
> +
> if (!smi_info->waiting_msg) {
> smi_info->curr_msg = NULL;
> rv = SI_SM_IDLE;
> @@ -800,6 +803,9 @@ static enum si_sm_result smi_event_handler(struct smi_info *smi_info,
> {
> enum si_sm_result si_sm_result;
>
> + if (unlikely(!smi_info->si_sm))
> + return SI_SM_IDLE;
> +
> restart:
> if (smi_info->si_state == SI_HOSED)
> /* Just in case, hosed state is only left from the timeout. */
> --
> 2.54.0
>