[PATCH 1/1] megaraid_{mm,mbox}: fix a bug in reset handler

From: Ju, Seokmann
Date: Wed Apr 12 2006 - 09:10:07 EST


Hi,

This patch has fix for a bug in the 'megaraid_reset_handler()'.

When abort failed, the driver gets reset handleer called. In the reset
handler, driver calls 'scsi_done()' callback for same SCSI command
packet (struct scsi_cmnd) multiple times if there are multiple SCSI
command packet in the pend_list. More over, if there are entry in the
pend_lsit with IOCTL packet associated, the driver returns it to wrong
free_list so that, in turn, the driver could end up with 'NULL pointer
dereference..' during I/O command building with incorrect resource.

Also, the patch contains several minor/cosmetic changes besides this.

Thank you,

Seokmann

Signed-Off By: Seokmann Ju <seokmann.ju@xxxxxxxx>


---
diff -Naur old/Documentation/scsi/ChangeLog.megaraid
new/Documentation/scsi/ChangeLog.megaraid
--- old/Documentation/scsi/ChangeLog.megaraid 2006-04-10
18:04:04.000000000 -0400
+++ new/Documentation/scsi/ChangeLog.megaraid 2006-04-12
09:35:20.939778336 -0400
@@ -1,3 +1,28 @@
+Release Date : Mon Apr 11 12:27:22 EST 2006 - Seokmann Ju
<sju@xxxxxxxx>
+Current Version : 2.20.4.8 (scsi module), 2.20.2.6 (cmm module)
+Older Version : 2.20.4.7 (scsi module), 2.20.2.6 (cmm module)
+
+1. Fixed a bug in megaraid_reset_handler().
+ Customer reported "Unable to handle kernel NULL pointer
dereference
+ at virtual address 00000000" when system goes to reset condition
+ for some reason. It happened randomly.
+ Root Cause: in the megaraid_reset_handler(), there is
possibility not
+ returning pending packets in the pend_list if there are multiple
+ pending packets.
+ Fix: Made the change in the driver so that it will return all
packets
+ in the pend_list.
+
+2. Added change request.
+ As found in the following URL, rmb() only didn't help the
+ problem. I had to increase the loop counter to 0xFFFFFF. (6 F's)
+ http://marc.theaimsgroup.com/?l=linux-scsi&m=110971060502497&w=2
+
+ I attached a patch for your reference, too.
+ Could you check and get this fix in your driver?
+
+ Best Regards,
+ Jun'ichi Nomura
+
Release Date : Fri Nov 11 12:27:22 EST 2005 - Seokmann Ju
<sju@xxxxxxxx>
Current Version : 2.20.4.7 (scsi module), 2.20.2.6 (cmm module)
Older Version : 2.20.4.6 (scsi module), 2.20.2.6 (cmm module)
diff -Naur old/drivers/scsi/megaraid/megaraid_mbox.c
new/drivers/scsi/megaraid/megaraid_mbox.c
--- old/drivers/scsi/megaraid/megaraid_mbox.c 2006-04-10
17:14:22.000000000 -0400
+++ new/drivers/scsi/megaraid/megaraid_mbox.c 2006-04-11
17:32:12.000000000 -0400
@@ -10,7 +10,7 @@
* 2 of the License, or (at your option) any later version.
*
* FILE : megaraid_mbox.c
- * Version : v2.20.4.7 (Nov 14 2005)
+ * Version : v2.20.4.8 (Apr 11 2006)
*
* Authors:
* Atul Mukker <Atul.Mukker@xxxxxxxx>
@@ -2655,32 +2655,48 @@
// Also, reset all the commands currently owned by the driver
spin_lock_irqsave(PENDING_LIST_LOCK(adapter), flags);
list_for_each_entry_safe(scb, tmp, &adapter->pend_list, list) {
-
list_del_init(&scb->list); // from pending list

- con_log(CL_ANN, (KERN_WARNING
- "megaraid: %ld:%d[%d:%d], reset from pending
list\n",
- scp->serial_number, scb->sno,
- scb->dev_channel, scb->dev_target));
+ if (scb->sno >= MBOX_MAX_SCSI_CMDS) {
+ con_log(CL_ANN, (KERN_WARNING
+ "megaraid: IOCTL packet with %d[%d:%d] being
reset\n",
+ scb->sno, scb->dev_channel, scb->dev_target));

- scp->result = (DID_RESET << 16);
- scp->scsi_done(scp);
+ scb->status = -EFAULT;

- megaraid_dealloc_scb(adapter, scb);
+ megaraid_mbox_mm_done(adapter, scb);
+ } else {
+ if (scb->scp == scp) { // Found command
+ con_log(CL_ANN, (KERN_WARNING
+ "megaraid: %ld:%d[%d:%d], reset
from pending list\n",
+ scp->serial_number, scb->sno,
+ scb->dev_channel,
scb->dev_target));
+ } else {
+ con_log(CL_ANN, (KERN_WARNING
+ "megaraid: IO packet with %d[%d:%d]
being reset\n",
+ scb->sno, scb->dev_channel,
scb->dev_target));
+ }
+
+ scb->scp->result = (DID_RESET << 16);
+ scb->scp->scsi_done(scb->scp);
+
+ megaraid_dealloc_scb(adapter, scb);
+ }
}
spin_unlock_irqrestore(PENDING_LIST_LOCK(adapter), flags);

if (adapter->outstanding_cmds) {
con_log(CL_ANN, (KERN_NOTICE
"megaraid: %d outstanding commands. Max wait %d
sec\n",
- adapter->outstanding_cmds, MBOX_RESET_WAIT));
+ adapter->outstanding_cmds,
+ (MBOX_RESET_WAIT + MBOX_RESET_EXT_WAIT)));
}

recovery_window = MBOX_RESET_WAIT + MBOX_RESET_EXT_WAIT;

recovering = adapter->outstanding_cmds;

- for (i = 0; i < recovery_window && adapter->outstanding_cmds;
i++) {
+ for (i = 0; i < recovery_window; i++) {

megaraid_ack_sequence(adapter);

@@ -2689,12 +2705,11 @@
con_log(CL_ANN, (
"megaraid mbox: Wait for %d commands to
complete:%d\n",
adapter->outstanding_cmds,
- MBOX_RESET_WAIT - i));
+ (MBOX_RESET_WAIT + MBOX_RESET_EXT_WAIT)
- i));
}

// bailout if no recovery happended in reset time
- if ((i == MBOX_RESET_WAIT) &&
- (recovering == adapter->outstanding_cmds)) {
+ if (adapter->outstanding_cmds == 0) {
break;
}

@@ -2918,12 +2933,12 @@
wmb();
WRINDOOR(raid_dev, raid_dev->mbox_dma | 0x1);

- for (i = 0; i < 0xFFFFF; i++) {
+ for (i = 0; i < 0xFFFFFF; i++) {
if (mbox->numstatus != 0xFF) break;
rmb();
}

- if (i == 0xFFFFF) {
+ if (i == 0xFFFFFF) {
// We may need to re-calibrate the counter
con_log(CL_ANN, (KERN_CRIT
"megaraid: fast sync command timed out\n"));
@@ -3475,7 +3490,7 @@
adp.drvr_data = (unsigned long)adapter;
adp.pdev = adapter->pdev;
adp.issue_uioc = megaraid_mbox_mm_handler;
- adp.timeout = 300;
+ adp.timeout = MBOX_RESET_WAIT + MBOX_RESET_EXT_WAIT;
adp.max_kioc = MBOX_MAX_USER_CMDS;

if ((rval = mraid_mm_register_adp(&adp)) != 0) {
diff -Naur old/drivers/scsi/megaraid/megaraid_mbox.h
new/drivers/scsi/megaraid/megaraid_mbox.h
--- old/drivers/scsi/megaraid/megaraid_mbox.h 2006-04-10
17:14:22.000000000 -0400
+++ new/drivers/scsi/megaraid/megaraid_mbox.h 2006-04-11
13:45:21.000000000 -0400
@@ -21,8 +21,8 @@
#include "megaraid_ioctl.h"


-#define MEGARAID_VERSION "2.20.4.7"
-#define MEGARAID_EXT_VERSION "(Release Date: Mon Nov 14 12:27:22 EST
2005)"
+#define MEGARAID_VERSION "2.20.4.8"
+#define MEGARAID_EXT_VERSION "(Release Date: Mon Apr 11 12:27:22 EST
2006)"


/*
---

Attachment: megaraid_mm_mbox.patch
Description: megaraid_mm_mbox.patch