Re: [PATCH] raid6: fix the input of raid6 algorithm

From: liuzhengyuan
Date: Wed Aug 24 2016 - 03:59:08 EST


Oh, get_random_*() is really expensive. Thanks for the tip. The boot log from my
aarch64 machine, shown below, indicates that it took about 0.6 seconds to fill
the disk data.

[ 0.172831] DMA: preallocated 256 KiB pool for atomic allocations
[ 0.788664] raid6: int64x1 gen() 121 MB/s
[ 0.856613] raid6: int64x1 xor() 74 MB/s
[ 0.924665] raid6: int64x2 gen() 166 MB/s
[ 0.992846] raid6: int64x2 xor() 95 MB/s
[ 1.060681] raid6: int64x4 gen() 290 MB/s
[ 1.128774] raid6: int64x4 xor() 160 MB/s
[ 1.196933] raid6: int64x8 gen() 238 MB/s
[ 1.264937] raid6: int64x8 xor() 148 MB/s
[ 1.332878] raid6: neonx1 gen() 256 MB/s
[ 1.400975] raid6: neonx1 xor() 130 MB/s
[ 1.468951] raid6: neonx2 gen() 333 MB/s
[ 1.537085] raid6: neonx2 xor() 181 MB/s
[ 1.605042] raid6: neonx4 gen() 451 MB/s
[ 1.673121] raid6: neonx4 xor() 289 MB/s
[ 1.741143] raid6: neonx8 gen() 452 MB/s
[ 1.809151] raid6: neonx8 xor() 277 MB/s
[ 1.809154] raid6: using algorithm neonx8 gen() 452 MB/s
[ 1.809157] raid6: .... xor() 277 MB/s, rmw enabled
[ 1.809160] raid6: using intx1 recovery algorithm

I replaced get_random_*() with a local PRNG based on the well-known
linear congruential method. The patch looks like this:

+/* use the linear congruential bit. */
+static int32_t get_random_number_by_lcb(void)
+{
+ static int32_t seed = 1;
+ int32_t ret = 0;
+ ret = ((seed * 1103515245) + 12345) & 0x7fffffff;
+ seed = ret;
+ return ret;
+}

/* Try to pick the best algorithm */
/* This code uses the gfmul table as convenient data set to abuse */
@@ -229,8 +238,8 @@ int __init raid6_select_algo(void)
for (i = 0; i < disks-2; i++) {
dptrs[i] = disk_ptr + PAGE_SIZE*i;
- for (j = 0; j < PAGE_SIZE; j++)
- get_random_bytes(dptrs[i]+j, 1);
+ for (j = 0; j < PAGE_SIZE; j = j + 4)
+ *(int32_t *)(dptrs[i]+j) = get_random_number_by_lcb();
}

dptrs[disks-2] = disk_ptr + PAGE_SIZE*(disks-2);
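
For anyone who wants to try the generator outside the kernel, here is a minimal
userspace sketch of the same idea. It is not the patch itself: the names
lcg_state, lcg_next() and lcg_fill() are made up for illustration, the state is
kept unsigned so the multiply wraps modulo 2^32 rather than overflowing a signed
integer, and memcpy() stands in for the pointer cast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the generator in the patch (same constants). */
static uint32_t lcg_state = 1;

static uint32_t lcg_next(void)
{
	/* Unsigned arithmetic wraps modulo 2^32; the mask keeps 31 bits. */
	lcg_state = (lcg_state * 1103515245u + 12345u) & 0x7fffffffu;
	return lcg_state;
}

/*
 * Fill buf with len bytes of generator output, one 32-bit word at a time.
 * len must be a multiple of 4, which holds for PAGE_SIZE.
 */
static void lcg_fill(unsigned char *buf, size_t len)
{
	size_t j;

	assert(len % 4 == 0);
	for (j = 0; j < len; j += 4) {
		uint32_t v = lcg_next();

		/* memcpy avoids alignment and aliasing concerns of a cast */
		memcpy(buf + j, &v, sizeof(v));
	}
}
```

Calling lcg_fill(page, 4096) on a 4096-byte buffer fills it with 1024 words.
Two properties of this generator may matter for test data: the 0x7fffffff mask
leaves bit 31 of every word clear, and with an odd multiplier and odd increment
the lowest bit simply alternates between 0 and 1.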

The boot log with this patch is shown below; it took about 0.08 seconds.

[ 0.172858] DMA: preallocated 256 KiB pool for atomic allocations
[ 0.256673] raid6: int64x1 gen() 121 MB/s
[ 0.324484] raid6: int64x1 xor() 73 MB/s
[ 0.392606] raid6: int64x2 gen() 166 MB/s
[ 0.460309] raid6: int64x2 xor() 92 MB/s
[ 0.528368] raid6: int64x4 gen() 290 MB/s
[ 0.596401] raid6: int64x4 xor() 156 MB/s
[ 0.664601] raid6: int64x8 gen() 238 MB/s
[ 0.732609] raid6: int64x8 xor() 148 MB/s
[ 0.800523] raid6: neonx1 gen() 256 MB/s
[ 0.868730] raid6: neonx1 xor() 129 MB/s
[ 0.936741] raid6: neonx2 gen() 334 MB/s
[ 1.004717] raid6: neonx2 xor() 202 MB/s
[ 1.072692] raid6: neonx4 gen() 451 MB/s
[ 1.140763] raid6: neonx4 xor() 260 MB/s
[ 1.208842] raid6: neonx8 gen() 452 MB/s
[ 1.276887] raid6: neonx8 xor() 277 MB/s
[ 1.276890] raid6: using algorithm neonx8 gen() 452 MB/s
[ 1.276894] raid6: .... xor() 277 MB/s, rmw enabled
[ 1.276897] raid6: using intx1 recovery algorithm
[ 1.276941] ACPI: Interpreter disabled.
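
The 0.6 and 0.08 second figures are the gaps between the DMA pool message and
the first raid6 line in each log. A small sketch of that arithmetic follows; the
helper name gap() and the constant names are made up for illustration, and the
timestamps are copied from the two logs.

```c
#include <assert.h>

/* printk timestamps in seconds, copied from the two boot logs */
static const double dma_orig = 0.172831, raid6_first_orig = 0.788664;
static const double dma_lcg  = 0.172858, raid6_first_lcg  = 0.256673;

/* Elapsed time between two timestamps taken from the same log. */
static double gap(double start, double end)
{
	assert(end >= start);
	return end - start;
}
```

gap(dma_orig, raid6_first_orig) is about 0.616 s and
gap(dma_lcg, raid6_first_lcg) is about 0.084 s. Assuming each raid6 line is
printed only after its benchmark finishes, both gaps also include the first
int64x1 gen() run (consecutive raid6 lines are roughly 0.068 s apart) and any
other init work in between, so the fill itself is somewhat shorter than either
figure.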

I'm not familiar with spurious D$ conflicts or CPU cache behavior. What do you
think of this PRNG? Is there anything else I need to do?

------------------ Original ------------------
From: "H. Peter Anvin"<hpa@xxxxxxxxx>;
Date: Tue, Aug 23, 2016 11:53 AM
To: "liuzhengyuan"<liuzhengyuan@xxxxxxxxxx>;
Cc: "shli"<shli@xxxxxxxxxx>; "linux-raid"<linux-raid@xxxxxxxxxxxxxxx>; "fenghua.yu"<fenghua.yu@xxxxxxxxx>; "linux-kernel"<linux-kernel@xxxxxxxxxxxxxxx>; "liuzhengyuang521"<liuzhengyuang521@xxxxxxxxx>;
Subject: Re: [PATCH] raid6: fix the input of raid6 algorithm

Do you have any idea how long this takes to run? People are already complaining about the boot time penalty. get_random_*() is quite expensive and is overkill...