Re: [PATCH 1/4] lib: vsprintf: Optimize division by 10 for small integers.

From: George Spelvin
Date: Mon Sep 24 2012 - 07:27:05 EST


>> +/* See comment in put_dec_full9 for choice of constants */
>> static noinline_for_stack
>> char *put_dec_full4(char *buf, unsigned q)
>> {
>> 	unsigned r;
>> -	r = (q * 0xcccd) >> 19;
>> +	r = (q * 0xccd) >> 15;
>> 	*buf++ = (q - 10 * r) + '0';
>> -	q = (r * 0x199a) >> 16;
>> +	q = (r * 0xcd) >> 11;
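
For anyone who wants to convince themselves that the smaller constants are
still exact, here is a minimal brute-force sketch. It assumes q < 10000 on
entry (as the name put_dec_full4 suggests), so r < 1000 at the second step,
and that unsigned is at least 32 bits. The helper names are made up for this
note and are not part of the patch.

```c
#include <assert.h>
#include <limits.h>

/* Constants copied from the quoted hunk; the function names are hypothetical. */
static unsigned div10_old_q(unsigned q) { return (q * 0xcccd) >> 19; }
static unsigned div10_new_q(unsigned q) { return (q * 0xccd) >> 15; }
static unsigned div10_old_r(unsigned r) { return (r * 0x199a) >> 16; }
static unsigned div10_new_r(unsigned r) { return (r * 0xcd) >> 11; }

/* Count inputs where any multiply-and-shift form disagrees with n / 10. */
static unsigned count_div10_mismatches(void)
{
	unsigned n, bad = 0;

	/* Assumption: products up to 9999 * 0xcccd must fit in unsigned. */
	assert(UINT_MAX >= 0xffffffffu);

	for (n = 0; n < 10000; n++)	/* first step: q has at most 4 digits */
		bad += (div10_old_q(n) != n / 10) + (div10_new_q(n) != n / 10);
	for (n = 0; n < 1000; n++)	/* second step: r has at most 3 digits */
		bad += (div10_old_r(n) != n / 10) + (div10_new_r(n) != n / 10);
	return bad;
}
```

Calling count_div10_mismatches() from a small test program should return 0
if both the old and the new constants are exact over those ranges.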

> I would use 16-bit shifts instead of smaller ones.
> There may be CPUs on which "get upper half of 32-bit reg"
> operation is cheaper or smaller than a shift.

Good point, but wouldn't those CPUs *also* have multi-cycle multiply,
or have to synthesize it out of shift-and-add, in which case smaller
constants would save even more cycles?
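
As a rough illustration of the shift-and-add case: a naive synthesis adds
one shifted copy of the operand per set bit in the constant, so the set-bit
count is a crude proxy for cost. The sketch below is only that, a sketch; a
real compiler may do better with subtraction or factoring, and the helper
names are hypothetical.

```c
/* Number of set bits, i.e. shifted copies summed by a naive shift-and-add. */
static unsigned set_bits(unsigned c)
{
	unsigned n = 0;

	for (; c; c &= c - 1)	/* clear the lowest set bit each pass */
		n++;
	return n;
}

/* r * 0xcd written out by hand: 0xcd = 128 + 64 + 8 + 4 + 1 */
static unsigned mul_0xcd_shift_add(unsigned r)
{
	return (r << 7) + (r << 6) + (r << 3) + (r << 2) + r;
}
```

By this measure 0xcccd has 9 set bits against 7 for 0xccd, and 0x199a has 7
against 5 for 0xcd, so each step drops two add terms.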

I'm thinking of the original MC68010 here, and I'm not sure that is even
a meaningful target any more. ColdFire has single-cycle shifts.

Can you think of a processor where that would actually be
an improvement?