Re: [PATCH 1/1] genirq/msi: Dynamic remove/add storage adapter hits EEH
From: Wen Xiong
Date: Thu Mar 27 2025 - 17:37:14 EST
What about tearing down resources first and then issuing the reset?
This SAS adapter supports a dual-controller configuration. Normally we
have two adapters in a system.
We configure one of them as the Primary adapter and the other as the
Secondary adapter.
When a remove operation is done on the Primary adapter, the adapter
firmware fails over to the Secondary adapter and configures it as
primary. During the failover process, the adapter firmware requests a
reset of the Secondary adapter, then sets it as the Primary adapter.
Secondary adapter failover triggers an adapter reset
(ipr_reset_get_unit_check_job()).
[ 940.742698] ipr 0206:a0:00.0: 9070: IOA requested reset -> FW requested
[ 940.742733] ipr 0206:a0:00.0: Adapter to Adapter Link Failed Due to SAS Fabric Change [PRC: 17101C25]
[ 940.742768] ipr 0206:a0:00.0: Remote IOA VPID/SN: IBM 57B4001SISIOA 00458021
When the Secondary adapter does a reset, we use the same code path as
the remove operation. We can't free the irqs for the Secondary adapter,
since the kernel has already assigned irqs to it.
Actually, we discussed calling pci_free_irq_vectors() before doing the
BIST reset when we were trying to fix this in the device driver.
That might cause other problems. It is also not what a user would
expect. For example, if they disabled irqbalance and manually set up irq
binding and affinity, freeing and reallocating the interrupts across a
reset would wipe out those changes, which would not be expected.
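To illustrate the affinity concern, here is a toy user-space model. It
is only a sketch and is not ipr or kernel code: struct toy_dev, the
toy_* helpers and TOY_DEFAULT_MASK are all hypothetical names, and the
assumption that a freshly allocated vector comes back with a default
mask is a simplification made for the example.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_DEFAULT_MASK 0xffu /* hypothetical default: CPUs 0-7 */

/* Toy stand-in for a device's interrupt vectors; not a kernel structure. */
struct toy_dev {
	unsigned int *affinity; /* one CPU mask per vector */
	int nvec;
};

/* Allocate n vectors; each one starts with the default affinity mask. */
static int toy_alloc_vectors(struct toy_dev *d, int n)
{
	int i;

	d->affinity = malloc(n * sizeof(*d->affinity));
	if (!d->affinity)
		return -1;
	for (i = 0; i < n; i++)
		d->affinity[i] = TOY_DEFAULT_MASK;
	d->nvec = n;
	return n;
}

/* Release all vectors, discarding whatever masks were set on them. */
static void toy_free_vectors(struct toy_dev *d)
{
	free(d->affinity);
	d->affinity = NULL;
	d->nvec = 0;
}

/* Reset that leaves the vectors allocated: user settings survive. */
static void toy_reset_keep_vectors(struct toy_dev *d)
{
	(void)d; /* only the (imaginary) hardware is reset */
}

/* Reset that frees and reallocates: every mask returns to the default. */
static int toy_reset_realloc_vectors(struct toy_dev *d)
{
	int n = d->nvec;

	toy_free_vectors(d);
	return toy_alloc_vectors(d, n);
}
```

In this model, a mask set by hand on one vector survives
toy_reset_keep_vectors() but is back at TOY_DEFAULT_MASK after
toy_reset_realloc_vectors(), which is the behavior a user who pinned
irqs manually would not expect.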
Thanks,
Wendy