[PATCH next 0/3] lib: Implement mul_u64_u64_div_u64_roundup()
From: David Laight
Date: Sat Apr 05 2025 - 16:45:55 EST
The pwm-stm32.c code wants a 'rounding up' version of mul_u64_u64_div_u64().
This can be done simply by adding 'divisor - 1' to the 128-bit product.
Implement mul_u64_add_u64_div_u64(a, b, c, d) = (a * b + c)/d based on the
existing code.
Define mul_u64_u64_div_u64(a, b, d) as mul_u64_add_u64_div_u64(a, b, 0, d) and
mul_u64_u64_div_u64_roundup(a, b, d) as mul_u64_add_u64_div_u64(a, b, d-1, d).
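As a rough sketch, those wrappers are just thin inlines over the new
primitive (the exact declarations added to include/linux/math64.h may
differ):

	/* Sketch only: assumes mul_u64_add_u64_div_u64() from patch 1. */
	static inline u64 mul_u64_u64_div_u64(u64 a, u64 b, u64 d)
	{
		return mul_u64_add_u64_div_u64(a, b, 0, d);
	}

	static inline u64 mul_u64_u64_div_u64_roundup(u64 a, u64 b, u64 d)
	{
		/* Adding d - 1 to the 128-bit product rounds the quotient up. */
		return mul_u64_add_u64_div_u64(a, b, d - 1, d);
	}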
Only x86-64 has an optimised (asm) version of the function.
That version is optimised to avoid the 'add c' when c is known to be zero.
In all other cases the extra code will be noise compared to the software
divide code.
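For reference, on compilers that provide unsigned __int128 the intended
semantics can be modelled as below (quotient overflow beyond 64 bits is
ignored here); this is only a behavioural model, not the portable
implementation:

	/* Behavioural model only, using u64 from <linux/types.h>. */
	static inline u64 mul_u64_add_u64_div_u64_model(u64 a, u64 b, u64 c, u64 d)
	{
		/* 128-bit product plus c, then a single 128-by-64 divide. */
		return ((unsigned __int128)a * b + c) / d;
	}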
I've updated the test module to test mul_u64_u64_div_u64_roundup() and
also enhanced it to verify the C division code on x86-64.
Note that the code generated by gcc (e.g. for 32-bit x86) just for the multiply
is rather more horrid than one would expect (clang does better).
I dread to think how long the divide loop takes.
And I'm not at all sure the call in kernel/sched/cputime.c is confined to
hardware initialisation; it may well be on a relatively common path.
David Laight (3):
lib: Add mul_u64_add_u64_div_u64() and mul_u64_u64_div_u64_roundup()
lib: Add tests for mul_u64_u64_div_u64_roundup()
lib: Update the muldiv64 tests to verify the C on x86-64
arch/x86/include/asm/div64.h | 19 ++--
include/linux/math64.h | 44 ++++++++-
lib/math/div64.c | 57 ++++++++----
lib/math/test_mul_u64_u64_div_u64.c | 136 +++++++++++++++++-----------
4 files changed, 179 insertions(+), 77 deletions(-)
--
2.39.5