Re: [PATCH 0/2] scsi: smartpqi: fix PCIe hot reset recovery
From: mateusz . nowicki
Date: Thu May 07 2026 - 06:27:48 EST
Hello,
Thank you, Laurence, much appreciated. I'll add your Tested-by and
Reviewed-by to both patches in v2 if the series needs a respin; otherwise
Don or Martin can pick the tags up when applying.
Thanks,
Mateusz
On 07.05.2026 03:45, Laurence Oberman wrote:
On Wed, 2026-05-06 at 18:21 -0400, Laurence Oberman wrote:
On Wed, 2026-05-06 at 14:01 +0000, Mateusz Nowicki wrote:
> A PCIe bus reset (e.g. "echo 1 > /sys/bus/pci/devices/<bdf>/reset") on a
> controller without FLR support leaves the HPE SR932i-p Gen10+ unusable
> until reboot: smartpqi registers no pci_error_handlers, so the driver
> is not notified, firmware reverts to SIS mode, and all queue mappings
> are dropped while the driver still drives PQI.
>
> Patch 1 adds .reset_prepare / .reset_done reusing
> pqi_ofa_ctrl_quiesce() / _unquiesce() / pqi_ctrl_init_resume().
>
> Patch 2 raises SIS_CTRL_READY_RESUME_TIMEOUT_SECS from 90s to 180s,
> matching the cold-boot path. Without it, patch 1 fails at the SIS
> ready check because firmware boot after reset takes ~125s on the
> SR932i-p Gen10+.
>
> Tested on HPE SR932i-p Gen10+ against Linus' master at 74fe02ce122a.
>
> Note: the From: header is my Posteo address because my employer's SMTP
> is unavailable for external mailing lists. The Signed-off-by carries
> the Microchip attribution.
>
> Mateusz Nowicki (2):
> scsi: smartpqi: add pci_error_handlers for bus reset recovery
> scsi: smartpqi: increase SIS ctrl ready resume timeout to 180s
>
> drivers/scsi/smartpqi/smartpqi_init.c | 47 +++++++++++++++++++++++++++
> drivers/scsi/smartpqi/smartpqi_sis.c  |  2 +-
> 2 files changed, 48 insertions(+), 1 deletion(-)
>
> --
> 2.43.0
>
>
>
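For readers following the thread, here is a minimal user-space sketch of
the flow the cover letter describes: quiesce in .reset_prepare, then in
.reset_done wait for firmware to report ready before re-initializing. It
is not code from the patches. Every name in it (mock_ctrl,
mock_reset_prepare, mock_bus_reset, mock_wait_for_ctrl_ready,
mock_reset_done, mock_reset_cycle) is a hypothetical stand-in, time is
simulated rather than slept, and the only figures taken from the cover
letter are the ~125s firmware boot time and the 90s and 180s timeouts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for controller state; NOT the driver's
 * struct pqi_ctrl_info. */
struct mock_ctrl {
	bool quiesced;             /* I/O blocked while the reset is in flight */
	bool pqi_mode;             /* false: firmware has fallen back to SIS mode */
	unsigned int fw_boot_secs; /* simulated firmware boot time after reset */
};

/* Same role as a .reset_prepare callback: stop issuing requests before
 * the bus reset reaches the device. */
static void mock_reset_prepare(struct mock_ctrl *ctrl)
{
	ctrl->quiesced = true;
}

/* Simulated bus reset: firmware reverts to SIS mode. */
static void mock_bus_reset(struct mock_ctrl *ctrl)
{
	ctrl->pqi_mode = false;
}

/* Poll once per simulated second until firmware is ready or the timeout
 * expires. Returns 0 when ready, -1 on timeout. */
static int mock_wait_for_ctrl_ready(const struct mock_ctrl *ctrl,
				    unsigned int timeout_secs)
{
	unsigned int elapsed;

	for (elapsed = 0; elapsed <= timeout_secs; elapsed++) {
		if (elapsed >= ctrl->fw_boot_secs)
			return 0;
	}
	return -1;
}

/* Same role as a .reset_done callback. The kernel callback returns void;
 * this mock returns int only so the outcome can be checked. */
static int mock_reset_done(struct mock_ctrl *ctrl, unsigned int timeout_secs)
{
	assert(ctrl->quiesced); /* reset_prepare must have run first */

	if (mock_wait_for_ctrl_ready(ctrl, timeout_secs) != 0)
		return -1;      /* controller stays quiesced and unusable */

	ctrl->pqi_mode = true;  /* back to PQI mode, queues re-established */
	ctrl->quiesced = false; /* resume I/O */
	return 0;
}

/* Run one prepare -> reset -> done cycle and report the result. */
static int mock_reset_cycle(unsigned int fw_boot_secs,
			    unsigned int timeout_secs)
{
	struct mock_ctrl ctrl = {
		.quiesced = false,
		.pqi_mode = true,
		.fw_boot_secs = fw_boot_secs,
	};

	mock_reset_prepare(&ctrl);
	mock_bus_reset(&ctrl);
	return mock_reset_done(&ctrl, timeout_secs);
}
```

In this model, mock_reset_cycle(125, 90) returns -1 and
mock_reset_cycle(125, 180) returns 0, which is the dependency between
the two patches that the cover letter points out.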
Hello
I reproduced this, so I am testing the patches as well.
They look correct to me; I will reply again with a review after testing.
Thanks
Laurence
[2513778.140012] smartpqi 0000:64:00.0: no heartbeat detected - last heartbeat count: 4207808511
[2513778.140031] smartpqi 0000:64:00.0: controller offline: reason code 0x4 (no controller heartbeat detected)
[2513778.141346] sd 1:0:0:0: [sda] tag#549 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=18s
[2513778.141355] sd 1:0:0:0: [sda] tag#550 FAILED Result:
"xfs_buf_ioend_handle_error+0xd5/0x3f0 [xfs]" at daddr 0x9f78 len 8
error 5
[2513778.141526] XFS (dm-0): log I/O error -5
Hello
For the series:
I tested the patches, and the controller recovers from the reset with
them applied. The patches look good.
Tested-by: Laurence Oberman <loberman@xxxxxxxxxx>
Reviewed-by: Laurence Oberman <loberman@xxxxxxxxxx>