On Thu, Feb 29, 2024 at 09:19:59PM -0500, Stefan Berger wrote:
+static void vli_mmod_fast_521(u64 *result, const u64 *product,
+ const u64 *curve_prime, u64 *tmp)
+{
+ const unsigned int ndigits = 9;
+ size_t i;
+
+ for (i = 0; i < ndigits; i++)
+ tmp[i] = product[i];
+ tmp[8] &= 0x1ff;
Hm, the other vli_mmod_fast_*() functions manually unroll those loops.
Wondering if that would make sense here as well? It's also possible
to tell gcc to unroll a loop with a per-function...
__attribute__((optimize("unroll-loops")))
...but I'm not sure about clang portability. >
@@ -941,6 +966,12 @@ static bool vli_mmod_fast(u64 *result, u64 *product,
+ case 9:
+ if (!strcmp(curve->name, "nist_521")) {
+ vli_mmod_fast_521(result, product, curve_prime, tmp);
+ break;
+ }
+ fallthrough;
If you reorder patch 4 and 5, you could check for curve->nbits == 521 here,
which might be cheaper than the string comparison.
-#define ECC_MAX_DIGITS (512 / 64) /* due to ecrdsa */
+#define ECC_MAX_DIGITS (576 / 64) /* due to NIST P521 */
Maybe DIV_ROUND_UP(521, 64) for clarity?
Thanks,
Lukas