Re: [GIT PULL] block bits for 2.6.29-rc5

From: Jens Axboe
Date: Fri Feb 20 2009 - 11:56:32 EST


On Fri, Feb 20 2009, Miller, Mike (OS Dev) wrote:
>
>
> > -----Original Message-----
> > From: Jens Axboe [mailto:jens.axboe@xxxxxxxxxx]
> > Sent: Friday, February 20, 2009 10:41 AM
> > To: Andrew Morton
> > Cc: torvalds@xxxxxxxxxxxxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; Miller, Mike (OS Dev)
> > Subject: Re: [GIT PULL] block bits for 2.6.29-rc5
> >
> > On Thu, Feb 19 2009, Andrew Morton wrote:
> > > On Wed, 18 Feb 2009 15:41:06 +0100
> > > Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
> > >
> > > > @@ -3404,6 +3601,24 @@ static int __devinit cciss_init_one(struct pci_dev *pdev,
> > > > int dac, return_code;
> > > > InquiryData_struct *inq_buff = NULL;
> > > >
> > > > + if (reset_devices) {
> > > > + /* Reset the controller with a PCI power-cycle */
> > > > + if (cciss_hard_reset_controller(pdev) || cciss_reset_msi(pdev))
> > > > + return -ENODEV;
> > > > +
> > > > + /* Some devices (notably the HP Smart Array 5i Controller)
> > > > + need a little pause here */
> > > > + schedule_timeout_uninterruptible(30*HZ);
> > >
> > > little!
> >
> > That does qualify as the understatement of the day :-)
> >
> > > Perhaps we should do a printk("no, your machine is not dead") here.
> >
> > Perhaps we should shrink it to something a little more
> > tolerable and put it in the noop loop instead. 30 seconds is insane...
>
> Some of these controllers do take a long time to recover from the
> reset because the firmware has to re-initialize. The firmware guys
> claim that's only a few seconds but that's not true.
>
> Granted, the 5i is old as dirt. Don't know how many are still out
> there running newer kernels.

So a small improvement would be to do that delay only for 5i. Or how
about just being a little more relaxed, ala the below? It's still 30
seconds in total, but that's now worst case. Will the 5i crap itself if
we attempt to talk to it too soon?

diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
index d2cb67b..b5a0611 100644
--- a/drivers/block/cciss.c
+++ b/drivers/block/cciss.c
@@ -3611,11 +3611,15 @@ static int __devinit cciss_init_one(struct pci_dev *pdev,
 		schedule_timeout_uninterruptible(30*HZ);

 		/* Now try to get the controller to respond to a no-op */
-		for (i=0; i<12; i++) {
+		for (i=0; i<30; i++) {
 			if (cciss_noop(pdev) == 0)
 				break;
-			else
-				printk("cciss: no-op failed%s\n", (i < 11 ? "; re-trying" : ""));
+
+			schedule_timeout_uninterruptible(HZ);
+		}
+		if (i == 30) {
+			printk(KERN_ERR "cciss: controller seems dead\n");
+			return -EBUSY;
 		}
 	}
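
For anyone who wants to play with the retry logic outside the driver, here
is a rough user-space sketch of the same bounded poll: try a no-op, sleep a
bit on failure, give up after a fixed number of attempts. The names
poll_until_ready() and fake_noop() are made up for illustration and are not
part of cciss.c, and nanosleep() stands in for
schedule_timeout_uninterruptible(), so treat it as a sketch only.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical probe callback: returns 0 once the device answers. */
typedef int (*probe_fn)(void *ctx);

/*
 * Call probe() up to max_tries times, sleeping delay_ms after each
 * failed attempt. Returns the number of failed attempts that preceded
 * the first success, or -1 if the device never answered.
 */
static int poll_until_ready(probe_fn probe, void *ctx,
			    int max_tries, long delay_ms)
{
	struct timespec ts = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };
	int i;

	assert(probe != NULL);

	for (i = 0; i < max_tries; i++) {
		if (probe(ctx) == 0)
			return i;
		/* User-space stand-in for schedule_timeout_uninterruptible() */
		nanosleep(&ts, NULL);
	}
	return -1;
}

/*
 * Fake controller for experiments: fails while the counter behind ctx
 * is positive (decrementing it each time), then starts answering.
 */
static int fake_noop(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -1;
	}
	return 0;
}
```

With max_tries = 30 and delay_ms = 1000 this has the same shape as the loop
in the patch: a controller that comes back quickly costs only a probe or
two, and only a dead one waits out the full 30 attempts.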


--
Jens Axboe
