Re: [PATCH v2] raid6: arm64: add SVE optimized implementation for syndrome generation

From: Christoph Hellwig

Date: Mon Mar 30 2026 - 01:30:33 EST

On Sun, Mar 29, 2026 at 04:01:06PM +0300, Demian Shulhan wrote:
> Furthermore, as Christoph suggested, I tested scalability on wider
> arrays since the default kernel benchmark is hardcoded to 8 disks,
> which doesn't give the unrolled SVE loop enough data to shine. On a
> 16-disk array, svex4 hits 15.1 GB/s compared to 8.0 GB/s for neonx4.
> On a 24-disk array, while neonx4 chokes and drops to 7.8 GB/s, svex4
> maintains a stable 15.0 GB/s, effectively doubling the throughput.
>
> I agree this patch should be put on hold for now. My intention is to
> leave these numbers here as evidence that implementing SVE context
> preservation in the kernel (the "good use case") is highly justifiable
> from both a power-efficiency and a wide-array throughput perspective
> for modern ARM64 hardware.
>
> Thanks again for your time and review!

To me this sounds like an interesting case for an SVE kernel API.
But I'm not really knowledgeable enough to provide one to help
with testing this further.