[PATCH 1/1] genirq/msi: Dynamic remove/add storage adapter hits EEH
From: wenxiong
Date: Wed Mar 19 2025 - 10:52:33 EST
From: Wen Xiong <wenxiong@xxxxxxxxxxxxx>
With the irqbalance daemon enabled, the dynamic remove/add test for storage
adapters (SCSI IPR and FC QLogic) hits EEH on PPC.
EEH: [c00000000004f75c] __eeh_send_failure_event+0x7c/0x160
EEH: [c000000000048444] eeh_dev_check_failure.part.0+0x254/0x650
EEH: [c008000001650678] eeh_readl+0x60/0x90 [ipr]
EEH: [c00800000166746c] ipr_cancel_op+0x2b8/0x524 [ipr]
EEH: [c008000001656524] ipr_eh_abort+0x6c/0x130 [ipr]
EEH: [c000000000ab0d20] scmd_eh_abort_handler+0x140/0x440
EEH: [c00000000017e558] process_one_work+0x298/0x590
EEH: [c00000000017eef8] worker_thread+0xa8/0x620
EEH: [c00000000018be34] kthread+0x124/0x130
EEH: [c00000000000cd64] ret_from_kernel_thread+0x5c/0x64
EEH: This PCI device has failed 1 times in the last hour and will be.
We took a PCIe bus trace and found that one MSI-X vector is cleared to 0
by the irqbalance daemon. With the irqbalance daemon disabled, we do not
see the issue on either adapter.
With debug enabled in the ipr driver:
[ 44.103071] ipr: Entering __ipr_remove
[ 44.103083] ipr: Entering ipr_initiate_ioa_bringdown
[ 44.103091] ipr: Entering ipr_reset_shutdown_ioa
[ 44.103099] ipr: Leaving ipr_reset_shutdown_ioa
[ 44.103105] ipr: Leaving ipr_initiate_ioa_bringdown
[ 44.149918] ipr: Entering ipr_reset_ucode_download
[ 44.149935] ipr: Entering ipr_reset_alert
[ 44.150032] ipr: Entering ipr_reset_start_timer
[ 44.150038] ipr: Leaving ipr_reset_alert
[ 44.244343] scsi 1:2:3:0: alua: Detached
[ 44.254300] ipr: Entering ipr_reset_start_bist
[ 44.254320] ipr: Entering ipr_reset_start_timer
[ 44.254325] ipr: Leaving ipr_reset_start_bist
[ 44.364329] scsi 1:2:4:0: alua: Detached
[ 45.134341] scsi 1:2:5:0: alua: Detached
[ 45.860949] ipr: Entering ipr_reset_shutdown_ioa
[ 45.860962] ipr: Leaving ipr_reset_shutdown_ioa
[ 45.860966] ipr: Entering ipr_reset_alert
[ 45.861028] ipr: Entering ipr_reset_start_timer
[ 45.861035] ipr: Leaving ipr_reset_alert
[ 45.964302] ipr: Entering ipr_reset_start_bist
[ 45.964309] ipr: Entering ipr_reset_start_timer
[ 45.964313] ipr: Leaving ipr_reset_start_bist
[ 46.264301] ipr: Entering ipr_reset_bist_done
[ 46.264309] ipr: Leaving ipr_reset_bist_done
--->
There is a very small window: the irqbalance daemon kicks in before the ipr
driver calls pci_restore_state(pdev) and reads back all zeros for that
MSI-X vector in __pci_read_msi_msg(). By the time the ipr driver calls
pci_restore_state(pdev) in ipr_reset_restore_cfg_space(), the cached message
for that vector has already been zeroed through the irqbalance path, so
pci_write_msg_msix() writes zeros to the vector.
Below is the MSI-X table for the ipr adapter after the irqbalance daemon
kicked in:
Dump MSIx table: index=0 address_lo=c800 address_hi=10000000 msg_data=0
Dump MSIx table: index=1 address_lo=c810 address_hi=10000000 msg_data=0
Dump MSIx table: index=2 address_lo=c820 address_hi=10000000 msg_data=0
Dump MSIx table: index=3 address_lo=c830 address_hi=10000000 msg_data=0
Dump MSIx table: index=4 address_lo=c840 address_hi=10000000 msg_data=0
Dump MSIx table: index=5 address_lo=c850 address_hi=10000000 msg_data=0
Dump MSIx table: index=6 address_lo=c860 address_hi=10000000 msg_data=0
Dump MSIx table: index=7 address_lo=c870 address_hi=10000000 msg_data=0
Dump MSIx table: index=8 address_lo=0 address_hi=0 msg_data=0
-------> hit EEH
Dump MSIx table: index=9 address_lo=c890 address_hi=10000000 msg_data=0
Dump MSIx table: index=10 address_lo=c8a0 address_hi=10000000 msg_data=0
Dump MSIx table: index=11 address_lo=c8b0 address_hi=10000000 msg_data=0
Dump MSIx table: index=12 address_lo=c8c0 address_hi=10000000 msg_data=0
Dump MSIx table: index=13 address_lo=c8d0 address_hi=10000000 msg_data=0
Dump MSIx table: index=14 address_lo=c8e0 address_hi=10000000 msg_data=0
Dump MSIx table: index=15 address_lo=c8f0 address_hi=10000000 msg_data=0
[ 46.264312] ipr: Entering ipr_reset_restore_cfg_space
[ 46.267439] ipr: Entering ipr_fail_all_ops
[ 46.267447] ipr: Leaving ipr_fail_all_ops
[ 46.267451] ipr: Leaving ipr_reset_restore_cfg_space
[ 46.267454] ipr: Entering ipr_ioa_bringdown_done
[ 46.267458] ipr: Leaving ipr_ioa_bringdown_done
[ 46.267467] ipr: Entering ipr_worker_thread
[ 46.267470] ipr: Leaving ipr_worker_thread
The irqbalance daemon triggers this path.
In __pci_read_msi_msg():
void __pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
{
	struct pci_dev *dev = msi_desc_to_pci_dev(entry);

	BUG_ON(dev->current_state != PCI_D0);

	if (entry->pci.msi_attrib.is_msix) {
		void __iomem *base = pci_msix_desc_addr(entry);

		if (WARN_ON_ONCE(entry->pci.msi_attrib.is_virtual))
			return;

		/* reads 0 before pci_restore_state() is called */
		msg->address_lo = readl(base + PCI_MSIX_ENTRY_LOWER_ADDR);
		/* reads 0 before pci_restore_state() is called */
		msg->address_hi = readl(base + PCI_MSIX_ENTRY_UPPER_ADDR);
		msg->data = readl(base + PCI_MSIX_ENTRY_DATA);
	...
	...
}
pseries_msi_write_msg() is then called, which stores the zeroed message in
entry->msg:
static void pseries_msi_write_msg(struct irq_data *data,...)
{
	struct msi_desc *entry = irq_data_get_msi_desc(data);

	entry->msg = *msg;
}
Later the ipr driver calls pci_restore_state(pdev):
  -> __pci_restore_msix_state()
     pci_restore_msix_state(struct pci_dev *dev)
  -> pci_write_msg_msix()
static inline void pci_write_msg_msix()
{
	..
	/* already cleared to 0 by the irqbalance daemon */
	writel(msg->address_lo, base + PCI_MSIX_ENTRY_LOWER_ADDR);
	/* already cleared to 0 by the irqbalance daemon */
	writel(msg->address_hi, base + PCI_MSIX_ENTRY_UPPER_ADDR);
	writel(msg->data, base + PCI_MSIX_ENTRY_DATA);
}
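The ordering described above can be modeled in user space. The following is
only a minimal sketch, not kernel code: every model_* name is hypothetical,
the structures are simplified stand-ins for struct msi_msg and the MSI-X
table slot, and the initial address/data values are arbitrary. It only shows
why a read-back between the adapter reset and pci_restore_state() leaves the
vector zeroed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct msi_msg; not the kernel definition. */
struct model_msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/* Hypothetical model of one MSI-X vector. */
struct model_msix_entry {
	struct model_msi_msg hw;     /* models the MSI-X table slot on the device */
	struct model_msi_msg cached; /* models entry->msg kept by the kernel */
};

/* Models the adapter reset: the table slot loses its contents. */
static void model_device_reset(struct model_msix_entry *e)
{
	e->hw = (struct model_msi_msg){0};
}

/* Models the affinity path: read the slot back, then cache what was read. */
static void model_affinity_update(struct model_msix_entry *e)
{
	struct model_msi_msg msg = e->hw; /* like __pci_read_msi_msg() */

	e->cached = msg;                  /* like pseries_msi_write_msg() */
}

/* Models pci_restore_state(): write the cached message back to the slot. */
static void model_restore(struct model_msix_entry *e)
{
	e->hw = e->cached;                /* like pci_write_msg_msix() */
}

/*
 * Runs reset -> (optional affinity update) -> restore.
 * Returns 1 if the slot address ends up all zero, 0 otherwise.
 */
static int model_run(int affinity_in_window)
{
	struct model_msix_entry e;

	/* Arbitrary non-zero values, chosen only for illustration. */
	e.hw = (struct model_msi_msg){ 0x1000, 0x1, 0x2a };
	e.cached = e.hw;

	/* Sanity check: the model starts from a programmed vector. */
	assert(e.cached.address_lo != 0 && e.cached.address_hi != 0);

	model_device_reset(&e);
	if (affinity_in_window)
		model_affinity_update(&e);
	model_restore(&e);

	return e.hw.address_lo == 0 && e.hw.address_hi == 0;
}
```

In this model, model_run(1) leaves the slot zeroed (the failing case), while
model_run(0) restores the original message.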
With the following patch applied, we no longer hit the issue. If you are
familiar with the MSI domain code, please suggest a better solution.
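For reference, the condition the patch adds can be read as a standalone
predicate. This is only an illustrative sketch; model_should_write() is a
hypothetical name and not part of the patch. As written, the condition skips
the write when either half of the address is zero, not only when both are.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper mirroring the condition added in the patch. */
static bool model_should_write(uint32_t address_lo, uint32_t address_hi)
{
	/* Write only if both address halves are non-zero. */
	return address_lo != 0 && address_hi != 0;
}
```

With the values from the table dump, model_should_write(0xc800, 0x10000000)
is true, and model_should_write(0, 0) (index 8) is false.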
Thanks,
Wendy
Signed-off-by: Wen Xiong <wenxiong@xxxxxxxxxxxxx>
---
kernel/irq/msi.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 396a067a8a56..fcde35efb64c 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -671,7 +671,8 @@ int msi_domain_set_affinity(struct irq_data *irq_data,
if (ret >= 0 && ret != IRQ_SET_MASK_OK_DONE) {
BUG_ON(irq_chip_compose_msi_msg(irq_data, msg));
msi_check_level(irq_data->domain, msg);
- irq_chip_write_msi_msg(irq_data, msg);
+ if ((msg->address_lo != 0) && (msg->address_hi != 0))
+ irq_chip_write_msi_msg(irq_data, msg);
}
return ret;
--
2.43.5