RE: x86_86 SMP megaraid_mbox hangups and panics.

From: Ju, Seokmann
Date: Wed Apr 12 2006 - 14:35:37 EST


Wednesday, April 12, 2006 1:53 PM, Dormando wrote:
> Can you confirm that you read my whole message? I might be a complete
> idiot, but let me reiterate some highlights and add a few
> more details
> from my last mail:
>
> If I build a 32-bit SMP OR 64-bit UP kernel, the cards will boot and
> work *every time*.
Sorry, I've gone through but, ...

So far, the driver has been working on various x86_64 platforms (both AMD64 and EM64T with more than 4 GB mem.). Is this first time x86_64 kernel on those platforms or, the systems have been running on previous x86_64 kernels?
If you send me console log file, I'll look into it. But, still the issue need to be handled by SE team for further investigation.

I've submitted a patch that addresses one problem in 'megaraid_reset_handler()', but I don't think the fix connected to your issue directly.

Thank you,

Seokmann



> -----Original Message-----
> From: dormando [mailto:dormando@xxxxxxxxx]
> Sent: Wednesday, April 12, 2006 1:53 PM
> To: Ju, Seokmann
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: x86_86 SMP megaraid_mbox hangups and panics.
>
> Ju, Seokmann wrote:
> > Hi,
> >> Most of the time the server hits: "megaraid: probe new
> device" - with
> >> the device information, then hangs and starts the 180 second
> >> countdown:
> >> "megaraid: wait for FW to boot [blah]"
> >> After which I get a VFS panic for not having a root disk.
> > This means the controller is NOT taking any commands from
> the driver at that time.
> > In other words, the F/W is NOT ready to take any command, yet.
> > It sounds like that the controller is NOT in good condition
> for some reason and needs to check sanity of it.
> > You may want to check with LSI logic SE team.
> >
> > Thank you,
>
> Can you confirm that you read my whole message? I might be a complete
> idiot, but let me reiterate some highlights and add a few
> more details
> from my last mail:
>
> If I build a 32-bit SMP OR 64-bit UP kernel, the cards will boot and
> work *every time*.
>
> Most of the time this is a *kernel panic* in megaraid_ack_sequence,
> somewhere through megaraid_isr in megaraid_mbox.c
> *Sometimes* it results in the firmware hanging like you
> mentioned above.
> If I compile the drivers as modules instead of statically, this
> *sometimes* results in the machine booting all the way. Once
> the machine
> boots up, I can give it a good 'ole I/O beatdown and it does
> not flinch.
>
> As in: I boot it once, it hangs. I reboot, it panics. I reboot, it
> panics. I reboot, it works. I have 16 cards in 16 systems that all
> exhibit the same behavior. Looks like an x86_64 SMP timing
> issue of some
> kind to me, and less of a card issue. Please correct me if I am wrong!
>
> -Dormando
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/