Re: Reported regressions for 4.7 as of Sunday, 2016-06-19
From: Johannes Thumshirn
Date: Wed Jun 22 2016 - 08:04:39 EST
On Tue, Jun 21, 2016 at 09:25:18PM -0400, Martin K. Petersen wrote:
> >>>>> "Linus" == Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
>
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1348342
>
> This first one appears to be a crash in a USB sound doodad and not
> qla2xxx. Also, this appears to be where the 4.5.5 -> 4.5.6 notion comes
> from. So we can probably ignore 4.5.5 as the last good revision.
>
> Linus> as far as I can tell. And neither of them looks very likely, but
> Linus> what do I know. Adding Martin Petersen and Johannes Thumshirn to
> Linus> the participants just in case they go "Ahh.."
>
> Doubt it's Johannes' tweak. The qla2xxx crash from the two other
> bugzilla entries is in:
>
> (gdb) list *qla24xx_process_response_queue+0x49
> 0x27e09 is in qla24xx_process_response_queue (drivers/scsi/qla2xxx/qla_isr.c:2560).
> 2555 if (rsp->msix->cpuid != smp_processor_id()) {
> 2556 /* if kernel does not notify qla of IRQ's CPU change,
> 2557 * then set it here.
> 2558 */
> 2559 rsp->msix->cpuid = smp_processor_id();
> 2560 ha->tgt.rspq_vector_cpuid = rsp->msix->cpuid;
> 2561 }
> 2562
> 2563 while (rsp->ring_ptr->signature != RESPONSE_PROCESSED) {
> 2564 pkt = (struct sts_entry_24xx *)rsp->ring_ptr;
>
> That particular code went into 4.5 and comes from:
>
> commit cdb898c52d1dfad4b4800b83a58b3fe5d352edde
> Author: Quinn Tran <quinn.tran@xxxxxxxxxx>
> Date: Thu Dec 17 14:57:05 2015 -0500
>
> qla2xxx: Add irq affinity notification
>
> Register to receive notification of when irq setting change
> occured.
>
> Signed-off-by: Quinn Tran <quinn.tran@xxxxxxxxxx>
> Signed-off-by: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
> Signed-off-by: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
>
> Quinn?
>
> --
> Martin K. Petersen Oracle Linux Engineering
Having a quick look at it I _think_ this could be the problem.
We request the IRQ _before_ actually assigning the rsp->msix entry. Now If an
IRQ triggers, before the assignment we touch rsp->msix->cpuid, which is
probably the case. At least from what I conduct from Martin's mail.
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 5649c20..20743a3 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -3086,6 +3086,8 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
/* Enable MSI-X vectors for the base queue */
for (i = 0; i < 2; i++) {
qentry = &ha->msix_entries[i];
+ qentry->rsp = rsp;
+ rsp->msix = qentry;
if (IS_P3P_TYPE(ha))
ret = request_irq(qentry->vector,
qla82xx_msix_entries[i].handler,
@@ -3097,8 +3099,6 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
if (ret)
goto msix_register_fail;
qentry->have_irq = 1;
- qentry->rsp = rsp;
- rsp->msix = qentry;
/* Register for CPU affinity notification. */
irq_set_affinity_notifier(qentry->vector, &qentry->irq_notify);
@@ -3119,12 +3119,12 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
*/
if (QLA_TGT_MODE_ENABLED() && IS_ATIO_MSIX_CAPABLE(ha)) {
qentry = &ha->msix_entries[ATIO_VECTOR];
+ qentry->rsp = rsp;
+ rsp->msix = qentry;
ret = request_irq(qentry->vector,
qla83xx_msix_entries[ATIO_VECTOR].handler,
0, qla83xx_msix_entries[ATIO_VECTOR].name, rsp);
qentry->have_irq = 1;
- qentry->rsp = rsp;
- rsp->msix = qentry;
}
msix_register_fail:
I'm not sure if we need the qentry->have_irq assingment as well, I'm not
deep enough into the qla2xx driver yet, maybe Quinn can clarify.
Beware of the above change being untested.
Byte,
Johannes
--
Johannes Thumshirn Storage
jthumshirn@xxxxxxx +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850