Re: Areca hardware RAID / first-ever SCSI bus reset: am I about to lose this disk controller?

From: Nix
Date: Thu Sep 20 2012 - 02:51:01 EST


On 19 Sep 2012, Stan Hoeppner stated:

> On 9/19/2012 1:52 PM, Nix wrote:
>> So I have this x86-64 server running Linux 3.5.1
>
> When did you install 3.5.1 on this machine?

Forty days ago.

> If fairly recently, does it
> run without these errors when booted into the previous kernel?

Well, since this error happened only once, and after thirty-nine days of
uptime at that, I'm not sure how I can find out. :)

>> with a SATA-on-PCIe
>> Areca 1210 hardware RAID-5 controller driven by libata which has been
>> humming along happily for years -- but suddenly, today, the entire
>> machine froze for a couple of minutes (or at least fs access froze),
>> followed by this in the logs:
>>
>> Sep 19 16:55:47 spindle notice: [3447524.381843] arcmsr0: abort device command of scsi id = 0 lun = 1
>> [... repeated a few times at intervals over the next five minutes,
>> followed by a mass of them at 16:59:29, and...]
>> Sep 19 16:59:25 spindle err: [3447657.821450] arcmsr: executing bus reset eh.....num_resets = 0, num_aborts = 33
>> Sep 19 16:59:25 spindle notice: [3447697.878386] arcmsr0: wait 'abort all outstanding command' timeout
>> Sep 19 16:59:25 spindle notice: [3447697.878628] arcmsr0: executing hw bus reset .....
>> Sep 19 16:59:25 spindle err: [3447698.287054] irq 16: nobody cared (try booting with the "irqpoll" option)
>> Sep 19 16:59:25 spindle warning: [3447698.287291] Pid: 0, comm: swapper/4 Not tainted 3.5.1-dirty #1
>> Sep 19 16:59:25 spindle warning: [3447698.287522] Call Trace:
>> Sep 19 16:59:25 spindle warning: [3447698.287754] <IRQ> [<ffffffff810af5ba>] __report_bad_irq+0x31/0xc2
>> Sep 19 16:59:25 spindle warning: [3447698.288031] [<ffffffff810af84e>] note_interrupt+0x16a/0x1e8
>> Sep 19 16:59:25 spindle warning: [3447698.288263] [<ffffffff810ad9d5>] handle_irq_event_percpu+0x163/0x1a5
>> Sep 19 16:59:25 spindle warning: [3447698.288497] [<ffffffff810ada4f>] handle_irq_event+0x38/0x55
>> Sep 19 16:59:25 spindle warning: [3447698.288727] [<ffffffff810b01a0>] handle_fasteoi_irq+0x78/0xab
>> Sep 19 16:59:25 spindle warning: [3447698.288960] [<ffffffff8103631c>] handle_irq+0x24/0x2a
>> Sep 19 16:59:25 spindle warning: [3447698.289189] [<ffffffff81036229>] do_IRQ+0x4d/0xb4
>> Sep 19 16:59:25 spindle warning: [3447698.289419] [<ffffffff815070e7>] common_interrupt+0x67/0x67
>> Sep 19 16:59:25 spindle warning: [3447698.289648] <EOI> [<ffffffff812ab174>] ? acpi_idle_enter_c1+0xcb/0xf2
>> Sep 19 16:59:25 spindle warning: [3447698.289919] [<ffffffff812ab152>] ? acpi_idle_enter_c1+0xa9/0xf2
>> Sep 19 16:59:25 spindle warning: [3447698.290152] [<ffffffff813c1446>] cpuidle_enter+0x12/0x14
>> Sep 19 16:59:25 spindle warning: [3447698.290382] [<ffffffff813c1902>] cpuidle_idle_call+0xc5/0x175
>> Sep 19 16:59:25 spindle warning: [3447698.290614] [<ffffffff8103c2da>] cpu_idle+0x5b/0xa5
>> Sep 19 16:59:25 spindle warning: [3447698.290844] [<ffffffff81ad4fcb>] start_secondary+0x1a2/0x1a6
>> Sep 19 16:59:25 spindle err: [3447698.291074] handlers:
>> Sep 19 16:59:25 spindle err: [3447698.291294] [<ffffffff8133b9a3>] usb_hcd_irq
>> Sep 19 16:59:25 spindle emerg: [3447698.291553] Disabling IRQ #16
>> Sep 19 16:59:25 spindle err: [3447710.888187] arcmsr0: waiting for hw bus reset return, retry=0
>> Sep 19 16:59:25 spindle err: [3447720.882155] arcmsr0: waiting for hw bus reset return, retry=1
>> Sep 19 16:59:25 spindle notice: [3447730.896410] Areca RAID Controller0: F/W V1.46 2009-01-06 & Model ARC-1210
>> Sep 19 16:59:25 spindle err: [3447730.916348] arcmsr: scsi bus reset eh returns with success
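
As for checking whether older kernels ever logged the same thing: a rough
sketch of tallying these events from saved logs, assuming the syslog format
quoted above. The function name, the category names, and the regexes are
illustrative guesses based only on those lines, not anything arcmsr itself
provides.

```python
import re
from collections import Counter

# Patterns are assumptions derived from the log lines quoted above;
# other arcmsr versions may word these messages differently.
PATTERNS = {
    "abort": re.compile(r"arcmsr\d+: abort device command"),
    "bus_reset": re.compile(r"arcmsr: executing bus reset eh"),
    "hw_reset": re.compile(r"arcmsr\d+: executing hw bus reset"),
    "irq_disabled": re.compile(r"Disabling IRQ #\d+"),
}


def tally_arcmsr_events(lines):
    """Count how many lines match each (hypothetical) event category."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# A few lines copied from the log above, used as sample input.
sample = [
    "Sep 19 16:55:47 spindle notice: [3447524.381843] arcmsr0: abort device command of scsi id = 0 lun = 1",
    "Sep 19 16:59:25 spindle err: [3447657.821450] arcmsr: executing bus reset eh.....num_resets = 0, num_aborts = 33",
    "Sep 19 16:59:25 spindle notice: [3447697.878628] arcmsr0: executing hw bus reset .....",
    "Sep 19 16:59:25 spindle emerg: [3447698.291553] Disabling IRQ #16",
]

if __name__ == "__main__":
    for name, count in sorted(tally_arcmsr_events(sample).items()):
        print(name, count)
```

Any iterable of lines works as input, so an open log file can be passed
directly. A tally of zero across logs from earlier kernels would suggest,
though not prove, that the problem is new.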

--
NULL && (void)