[PATCH v2 0/4] do_div() with constant divisor simplification
From: Nicolas Pitre
Date: Sun Jul 07 2024 - 13:20:14 EST
While working on mul_u64_u64_div_u64() improvements I realized that there
is a better way to perform a 64x64->128-bit multiplication with overflow
handling. This is not as lean as v1 of the series but still much better
than the existing code IMHO.
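For readers unfamiliar with the operation, here is a minimal sketch of a
64x64->128-bit multiplication built from 32-bit pieces, returning only the
upper 64 bits. It is not the code from this series: the name
xprod64_hi_sketch() and its bias parameter (which adds m to the product
before taking the high half) are assumptions made for illustration, and the
sketch always propagates carries rather than skipping overflow handling.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only:
 * returns (m * n + (bias ? m : 0)) >> 64 without a 128-bit type.
 */
static uint64_t xprod64_hi_sketch(uint64_t m, uint64_t n, bool bias)
{
	uint32_t m_lo = (uint32_t)m, m_hi = (uint32_t)(m >> 32);
	uint32_t n_lo = (uint32_t)n, n_hi = (uint32_t)(n >> 32);

	/* Four 32x32->64 partial products. */
	uint64_t ll = (uint64_t)m_lo * n_lo;
	uint64_t lh = (uint64_t)m_lo * n_hi;
	uint64_t hl = (uint64_t)m_hi * n_lo;
	uint64_t hh = (uint64_t)m_hi * n_hi;

	/*
	 * Sum of the terms landing in bits 32..63. Each term is below
	 * 2^32, so the sum fits in 64 bits and its upper half is the
	 * carry into bit 64.
	 */
	uint64_t mid = (ll >> 32) + (uint32_t)lh + (uint32_t)hl;

	uint64_t lo = (mid << 32) | (uint32_t)ll;
	uint64_t hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);

	if (bias) {
		/* Add m to the low half and carry into the high half. */
		uint64_t sum = lo + m;

		if (sum < lo)
			hi++;
	}

	return hi;
}
```

The intermediate sums are where overflow can occur in a more aggressive
version that adds the partial products directly; this sketch avoids that by
splitting them into 32-bit halves first, at the cost of extra operations.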
Changes from v1:
- Formalize condition for when overflow handling can be skipped.
- Make this condition apply only if it can be determined at compile time
  (beware of the compiler not always inlining code).
- Keep the ARM assembly but apply the above changes to it as well.
- Simplify generic C code for __arch_xprod64() further.
- Force __always_inline when optimizing for performance.
- Augment test_div64.c with important edge cases.
Link to v1: https://lore.kernel.org/lkml/20240705022334.1378363-1-nico@xxxxxxxxxxx/
The diffstat is:
 arch/arm/include/asm/div64.h |  13 +++-
 include/asm-generic/div64.h  | 121 ++++++++++++-----------------------
 lib/math/test_div64.c        |  85 +++++++++++++++++++++++-
 3 files changed, 134 insertions(+), 85 deletions(-)