Re: Kernel crash during resync of raid5 on SMP

From: Otto Meier (gf435@gmx.net)
Date: Thu Mar 08 2001 - 10:19:50 EST


On Thu, 8 Mar 2001 08:55:28 +1100 (EST), Neil Brown wrote:

>On Wednesday March 7, gf435@gmx.net wrote:
>> I have been running a dual-processor SMP system on 2.4.2-ac12
>> in degraded mode for a while. Today I put in a new disk to switch
>> to full raid5 mode. Shortly after the raidhotadd command, the
>> system crashed with the message lost interrupt on cpu1.
>
>Was there an Oops? Can we see it? Decoded with ksymoops, of course.

Unfortunately, I entered this command remotely. The console display was
off at that time.

>Are you happy to retry? (i.e. raidsetfaulty; raidhotremove,
>raidhotadd). If so, Could you try with 2.4.2?

I would rather not do that, since everything has been running fine again
for a day now.

>Whereabouts in the sync process did it die? Start? End? Middle?
>various?

After the first crash I needed to reboot. It crashed again shortly after
the boot message saying that the resync was starting.

This happened several times until I booted with the kernel parameter
maxcpus=1. Then it worked without a problem. After the resync finished
I could start it again in SMP mode, and everything has worked fine since.

Sorry that I cannot shed any more light on it.

Otto

>NeilBrown
>
>
>>
>> This continued after reboot. I finally managed to get it running again
>> by booting with the kernel parameter maxcpus=1. In this one-CPU mode
>> it finished resyncing.
>>
>> During this process I was never able to resync with two CPUs.
>>
>> After the resync finished, the system now runs fine in dual SMP mode again.
>>
>> Perhaps there might be an issue with spinlocks during resyncing.
>>
>> Bye Otto



