Re: [PATCH 14/15] scsi: qla2xxx: add back SRR support

From: Xose Vazquez Perez
Date: Thu Sep 25 2025 - 08:50:12 EST


On 9/8/25 9:10 PM, Tony Battersby wrote:

(target mode)

[...]

I ran into some HBA firmware bugs with QLE2694L firmware 9.06.02 -
9.08.02 where a SRR would cause the HBA to misbehave badly. Since SRRs
are rare and therefore difficult to test, I figured it would be worth
checking for the buggy firmware and disabling SLER with a warning
instead of letting others run into the same problem on the rare
occasion that they get a SRR. This turned out to be difficult because
the firmware version isn't known in the normal NVRAM config routine, so
I added a second NVRAM config routine that is called after the firmware
version is known. It may be necessary to add checks for additional
buggy firmware versions or additional chips that I was not able to
test.

Signed-off-by: Tony Battersby <tonyb@xxxxxxxxxxxxxxx>
---
drivers/scsi/qla2xxx/qla_dbg.c | 1 +
drivers/scsi/qla2xxx/qla_init.c | 1 +
drivers/scsi/qla2xxx/qla_target.c | 1030 ++++++++++++++++++++++++++++
drivers/scsi/qla2xxx/qla_target.h | 81 +++
drivers/scsi/qla2xxx/tcm_qla2xxx.c | 15 +
5 files changed, 1128 insertions(+)

[...]

+ * Return true if the HBA firmware version is known to have bugs that
+ * prevent Sequence Level Error Recovery (SLER) / Sequence Retransmission
+ * Request (SRR) from working.
+ */
+static bool qlt_has_sler_fw_bug(struct qla_hw_data *ha)
+{
+ bool has_sler_fw_bug = false;
+
+ if (IS_QLA2071(ha)) {
+ /*
+ * QLE2694L known bad firmware:
+ * 9.06.02
+ * 9.07.00
+ * 9.08.02
+ * SRRs trigger hundreds of bogus entries in the response
+ * queue and various other problems.
+ *
+ * QLE2694L known good firmware:
+ * 8.08.05
+ * 9.09.00
+ *
+ * QLE2694L unknown firmware:
+ * 9.00.00 - 9.05.xx
+ */
+ if (ha->fw_major_version == 9 &&
+ ha->fw_minor_version >= 6 &&
+ ha->fw_minor_version <= 8)
+ has_sler_fw_bug = true;
+ }
+
+ return has_sler_fw_bug;
+}

[...]

+/* Update any settings that depend on ha->fw_*_version. */
+void
+qlt_config_nvram_with_fw_version(struct scsi_qla_host *vha)
+{
+ struct qla_hw_data *ha = vha->hw;
+
+ if (!QLA_TGT_MODE_ENABLED())
+ return;
+
+ if (ql2xtgt_tape_enable && qlt_has_sler_fw_bug(ha)) {
+ ql_log(ql_log_warn, vha, 0x11036,
+ "WARNING: ignoring ql2xtgt_tape_enable due to buggy HBA firmware; please upgrade FW\n");
+
+ /* Disable FC Tape support */
+ if (ha->isp_ops->nvram_config == qla81xx_nvram_config) {
+ struct init_cb_81xx *icb =
+ (struct init_cb_81xx *)ha->init_cb;
+ icb->firmware_options_2 &= cpu_to_le32(~BIT_12);
+ } else {
+ struct init_cb_24xx *icb =
+ (struct init_cb_24xx *)ha->init_cb;
+ icb->firmware_options_2 &= cpu_to_le32(~BIT_12);
+ }
+ }
+}
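
A side note on the "&= cpu_to_le32(~BIT_12)" idiom above: the mask is
converted to little-endian rather than the field being converted to CPU
order and back, so the same statement works on both little- and
big-endian hosts. A minimal userspace sketch of that idea follows.
to_le32(), from_le32(), FC_TAPE_BIT and clear_fc_tape() are hypothetical
stand-ins written for this illustration, not the kernel helpers or
anything from the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FC_TAPE_BIT (UINT32_C(1) << 12)	/* stand-in for BIT_12 */

/* Hypothetical userspace stand-in for cpu_to_le32(). */
static uint32_t to_le32(uint32_t v)
{
	uint8_t b[4] = { (uint8_t)v, (uint8_t)(v >> 8),
			 (uint8_t)(v >> 16), (uint8_t)(v >> 24) };
	uint32_t out;

	/* Store the bytes least-significant first, whatever the host is. */
	memcpy(&out, b, sizeof(out));
	return out;
}

/* Hypothetical userspace stand-in for le32_to_cpu(). */
static uint32_t from_le32(uint32_t v)
{
	uint8_t b[4];

	memcpy(b, &v, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Clear one bit in a little-endian option word; other bits are kept. */
static void clear_fc_tape(uint32_t *fw_opts2_le)
{
	assert(fw_opts2_le != NULL);
	*fw_opts2_le &= to_le32(~FC_TAPE_BIT);
}
```

Reading the word back with from_le32() after clear_fc_tape() should show
only bit 12 changed.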

If you want to review the firmware changelog (covering 9.00.00 to
9.15.05 [06/10/25]), the main entries are FCD-1183 (FCD-371, ER147301),
FCD-259 and ER146998:
https://www.marvell.com/content/dam/marvell/en/drivers/2025-06-10-release/fw_release_notes/Fibre_Channel_Firmware_Release_Notes.pdf

It looks like all 2{678}xx devices/chips are affected by this bug.
Perhaps the Marvell crew could provide more information on this.
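
If that turns out to be the case, the check could be widened along these
lines. This is only a userspace sketch under two unconfirmed
assumptions: that 26xx and 28xx parts are affected at all, and that the
bad range is the same 9.06.xx through 9.08.xx that the patch tests for.
enum chip_family, struct fw_version, fw_key() and has_sler_fw_bug() are
hypothetical names for illustration and are not the driver's IS_QLA*()
macros or data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical chip families; not the driver's IS_QLA*() macros. */
enum chip_family { CHIP_26XX, CHIP_27XX, CHIP_28XX, CHIP_OTHER };

struct fw_version {
	uint16_t major;
	uint16_t minor;
	uint16_t sub;
};

/* Pack a version so that plain integer comparison orders it. */
static uint64_t fw_key(struct fw_version v)
{
	return ((uint64_t)v.major << 32) | ((uint64_t)v.minor << 16) | v.sub;
}

static bool has_sler_fw_bug(enum chip_family chip, struct fw_version v)
{
	/* Assumed range, same as the patch: 9.06.xx through 9.08.xx. */
	const struct fw_version first_bad = { 9, 6, 0 };
	const struct fw_version first_good = { 9, 9, 0 };

	assert(fw_key(first_bad) < fw_key(first_good));

	if (chip == CHIP_OTHER)
		return false;

	return fw_key(v) >= fw_key(first_bad) &&
	       fw_key(v) < fw_key(first_good);
}
```

Comparing a packed key keeps the range in one place, so a different
first bad or first good version only means changing one constant.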