Re: RAID extremely slow

From: CoolCold
Date: Wed Jul 25 2012 - 22:10:32 EST


On Thu, Jul 26, 2012 at 5:55 AM, Kevin Ross <kevin@xxxxxxxxxxxxxx> wrote:
>
> Thank you very much for taking the time to look into this.
>
>
> On 07/25/2012 06:00 PM, Phil Turmel wrote:
>>
>> Piles of small reads scattered across multiple drives, and a
>> concentration of queued writes to /dev/sda. What's on /dev/sda?
>> It's not a member of the raid, so it must be some other system task
>> involved.
>
>
> /dev/sda1 is the root filesystem. The writes were most likely by MySQL,
> but I would have to run iotop to be sure.
>
>
>> [ The output of "lsdrv" [1] might be useful here, along with
>> "mdadm -D /dev/md0" and "mdadm -E /dev/[b-j]" ]
>
>
> Here you go: http://pastebin.ca/2174740
>
>
>> MythTV is trying to flush recorded video to disk, I presume. Sync is
>> known to cause stalls--a great deal of work is on-going to improve
>> this. How old is this kernel?
>
>
> After rebooting, MythTV is currently recording two shows, and the resync
> is running at full speed.
>
>
> # cat /proc/mdstat
> Personalities : [raid6] [raid5] [raid4]
> md0 : active raid6 sdh1[0] sdd1[9] sde1[10] sdb1[6] sdi1[7] sdc1[4]
> sdf1[3] sdg1[8] sdj1[1]
> 6837311488 blocks super 1.2 level 6, 512k chunk, algorithm 2 [9/9]
> [UUUUUUUUU]
> [=>...................] resync = 9.3% (91363840/976758784)
> finish=1434.3min speed=10287K/sec
>
> unused devices: <none>
>
> atop shows the avio of all the drives to be less than 1ms, where before
> they were much higher. It will run for a couple days under load just fine,
> and then it will come to a halt.
>
> It's a 3.2.21 kernel. I'm running Debian Testing, and the exact Debian
> package version is:
>
> ii linux-image-3.2.0-3-686-pae 3.2.21-3
> Linux 3.2 for modern PCs
>
>
>>
>>> [51000.672258] [<c12c409f>] ? sysenter_do_call+0x12/0x28
>>> [51000.672261] [<c12b0000>] ? quirk_usb_early_handoff+0x4a9/0x522
>>>
>>> Here is some other possibly relevant info:
>>>
>>> # cat /proc/mdstat
>>> Personalities : [raid6] [raid5] [raid4]
>>> md0 : active raid6 sdh1[0] sdd1[9] sde1[10] sdb1[6] sdi1[7] sdc1[4]
>>> sdf1[3] sdg1[8] sdj1[1]
>>> 6837311488 blocks super 1.2 level 6, 512k chunk, algorithm 2
>>> [9/9]
>>> [UUUUUUUUU]
>>> [==========>..........] resync = 51.3% (501954432/976758784)
>>> finish=28755.6min speed=275K/sec
>>
>> Is this resync a weekly check, or did something else trigger it?
>
>
> This is not a scheduled check. It was triggered by, I believe, an unclean
> shutdown. An unclean shutdown will trigger a resync. I don't think it used
> to do this, but I could be remembering wrong.
>
>
>>
>>> unused devices:<none>
>>>
>>> # cat /proc/sys/dev/raid/speed_limit_min
>>> 10000
>>
>> MD is unable to reach its minimum rebuild rate while other system
>> activity is ongoing. You might want to lower this number to see if that
>> gets you out of the stalls.
>>
>> Or temporarily shut down mythtv.
>
>
> I will try lowering those numbers next time this happens, which will
> probably be within the next day or two. That's about how often this
> happens.
You might be interested in write intent bitmap then, it gonna help a lot.
(resending in plain text)
>
>
>>> # cat /proc/sys/dev/raid/speed_limit_max
>>> 200000
>>>
>>> Thanks in advance!
>>> -- Kevin
>>
>> HTH,
>>
>> Phil
>>
>> [1] http://github.com/pturmel/lsdrv
>>
>
> Thanks!
> -- Kevin
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html




--
Best regards,
[COOLCOLD-RIPN]
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/