Re: [PATCH] ipmi: Add timeout to unconditional wait in __get_device_id()
From: Matt Fleming
Date: Fri Apr 17 2026 - 12:01:33 EST
On Thu, Apr 16, 2026 at 10:28:50AM -0400, Tony Camuso wrote:
>
> In my testing with updates from the Linus tree, after a BMC cold reset:
> 1. The KCS driver returned -EBUSY to callers (good)
> 2. The watchdog daemon received the error and initiated shutdown
> 3. No D-state hang
>
> My tests, conducted on a Dell PER640, verified that Corey's upstream fixes
> cause the driver to properly return errors instead of blocking.
> At least on that platform.
>
> Which low-level driver are you using (KCS, BT, SSIF)?
> The PER640 uses KCS.
> # cat /sys/class/ipmi/ipmi0/device/params 2>/dev/null
> kcs,i/o,0xca8,rsp=4,rsi=1,rsh=0,irq=10,ipmb=32
$ cat /sys/class/ipmi/ipmi0/device/params
kcs,i/o,0xca2,rsp=1,rsi=1,rsh=0,irq=0,ipmb=32
attentions 3
complete_transactions 7080342
events 3
flag_fetches 0
hosed_count 1
idles 25359147
incoming_messages 0
interrupts 0
long_timeouts 264790
short_timeouts 13723711
watchdog_pretimeouts 0
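
The two params lines above differ in a few fields. A minimal sketch for
diffing them, assuming the hotmod-style layout "type,addr_space,addr" followed
by key=value options (rsp, rsi, rsh being register spacing, size and shift).
The helper names parse_si_params and diff_params are hypothetical, not part of
any IPMI tooling:

```python
def parse_si_params(line):
    """Parse one ipmi_si params line into a flat dict (assumed layout)."""
    fields = line.strip().split(",")
    # The first three fields are positional: interface type, address space, address.
    parsed = {
        "type": fields[0],
        "addr_space": fields[1],
        "addr": int(fields[2], 16),
    }
    # The remaining fields are key=value pairs such as rsp=4 or irq=10.
    for field in fields[3:]:
        key, _, value = field.partition("=")
        parsed[key] = int(value, 0)
    return parsed


def diff_params(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# The two lines quoted in this thread.
per640 = parse_si_params("kcs,i/o,0xca8,rsp=4,rsi=1,rsh=0,irq=10,ipmb=32")
other = parse_si_params("kcs,i/o,0xca2,rsp=1,rsi=1,rsh=0,irq=0,ipmb=32")

for key, (left, right) in diff_params(per640, other).items():
    print(f"{key}: {left} -> {right}")
```

Only the base address, the register spacing and the irq differ. The irq=0 on
the second machine, together with "interrupts 0" in the stats, suggests that
interface runs without an interrupt, which may matter when comparing behaviour
against the PER640.
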
> Actually, no. The 54 commits I backported simply bring my RHEL-9 test kernel
> to parity with the Linus tree, which includes [2] and ...
> cae66f1a1dcd 2026-02-13 corey@xxxxxxxxxxx ipmi:si: Fix check for a misbehaving BMC
Ah, I see we have some machines on v6.18.20, which includes this commit,
and they're still triggering the problem.