On Tue, Jan 19, 2021 at 5:17 AM Adrian Ratiu <adrian.ratiu@xxxxxxxxxxxxx> wrote:
From: Nathan Chancellor <natechancellor@xxxxxxxxx>
Drop the warning because the kernel now requires GCC >= v4.9 after commit 6ec4476ac825 ("Raise gcc version requirement to 4.9"), and clarify that -ftree-vectorize now always needs enabling for GCC by directly testing for the presence of CONFIG_CC_IS_GCC.
Another reason to remove the warning is that Clang exposes itself as GCC < 4.6, so it triggers the warning about GCC, which doesn't make much sense and misleads Clang users by telling them to update GCC.
Because Clang is now supported by the kernel, print a clear Clang-specific warning.
Link: https://github.com/ClangBuiltLinux/linux/issues/496
Link: https://github.com/ClangBuiltLinux/linux/issues/503
Reported-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
Reviewed-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
This is not the version of the patch I had reviewed; please drop my reviewed-by tag when you change a patch significantly, as otherwise it looks like I approved this patch.
Nacked-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
Signed-off-by: Nathan Chancellor <natechancellor@xxxxxxxxx>
Signed-off-by: Adrian Ratiu <adrian.ratiu@xxxxxxxxxxxxx>
arch/arm/lib/xor-neon.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index b99dd8e1c93f..f9f3601cc2d1 100644
@@ -14,20 +14,22 @@ MODULE_LICENSE("GPL");
#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
+ * TODO: Even though -ftree-vectorize is enabled by default in Clang, the
+ * compiler does not produce vectorized code due to its cost model.
+ * See: https://github.com/ClangBuiltLinux/linux/issues/503
+#warning Clang does not vectorize code in this file.
Arnd, remind me again why it's a bug that the compiler's cost model
says it's faster to not produce a vectorized version of these loops?
I stand by my previous comment: https://bugs.llvm.org/show_bug.cgi?id=40976#c8
* Pull in the reference implementations while instructing GCC (through
* -ftree-vectorize) to attempt to exploit implicit parallelism and emit
* NEON instructions.
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
#pragma GCC optimize "tree-vectorize"
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
-#warning This code requires at least version 4.6 of GCC
#pragma GCC diagnostic ignored "-Wunused-variable"
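
For anyone following along, here is a minimal sketch of why the removed `__GNUC__` version check misfires under Clang, and of a preprocessor test that tells the two compilers apart. The function names are hypothetical and exist only for illustration. The patch itself tests CONFIG_CC_IS_GCC, which comes from Kconfig and is not available in a standalone file, so `__clang__` stands in for it here. As far as I know, Clang by default defines `__GNUC__` as 4 and `__GNUC_MINOR__` as 2, which is what makes the old check fail.

```c
#include <stdbool.h>

/* Hypothetical helpers, not part of the patch or of the kernel. */

/* True when the compiler identifies itself as Clang. */
bool built_with_clang(void)
{
#if defined(__clang__)
	return true;
#else
	return false;
#endif
}

/*
 * True when the compiler defines __GNUC__ and is not Clang. This is only
 * a rough stand-in for the kernel's CONFIG_CC_IS_GCC: other compilers
 * that define __GNUC__ would also land here.
 */
bool built_with_real_gcc(void)
{
#if defined(__GNUC__) && !defined(__clang__)
	return true;
#else
	return false;
#endif
}

/* True when the version check removed by the patch would have passed. */
bool old_gcc_version_check_passes(void)
{
#if defined(__GNUC__) && \
	(__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6))
	return true;
#else
	return false;
#endif
}
```

With a default Clang build I would expect `built_with_clang()` to return true and `old_gcc_version_check_passes()` to return false, which is the case that used to print the misleading "requires at least version 4.6 of GCC" warning.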