Re: [PATCH v2] tile: avoid using clocksource_cyc2ns with absolute cycle count

From: Peter Zijlstra
Date: Thu Nov 17 2016 - 04:53:58 EST


On Wed, Nov 16, 2016 at 03:16:59PM -0500, Chris Metcalf wrote:
> PeterZ (cc'ed) then improved it to use __int128 math via
> mul_u64_u32_shr(), but that doesn't help tile; we only do one multiply
> instead of two, but the multiply is handled by an out-of-line call to
> __multi3, and the sched_clock() function ends up about 2.5x slower as
> a result.

Well, only if you set CONFIG_ARCH_SUPPORTS_INT128; otherwise it reduces
to two 32x32->64 multiplications, one of which is conditional on there
actually being bits set in the high word of the u64 argument.
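
For illustration, a minimal sketch of what that non-INT128 path looks
like, assuming shift <= 32 and a result that fits in 64 bits. The name
sketch_mul_u64_u32_shr() is hypothetical and the types are plain C99
ones; this is modeled on the generic fallback, not copied from the tree:

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the generic (non-__int128) fallback:
 * computes (a * mul) >> shift using only 32x32->64 multiplies.
 * Assumes shift <= 32 and that the shifted result fits in 64 bits.
 */
static inline uint64_t sketch_mul_u64_u32_shr(uint64_t a, uint32_t mul,
					      unsigned int shift)
{
	uint32_t al = (uint32_t)a;		/* low word of the argument */
	uint32_t ah = (uint32_t)(a >> 32);	/* high word of the argument */
	uint64_t ret;

	/* First multiply: always needed. */
	ret = ((uint64_t)al * mul) >> shift;

	/* Second multiply: only when the high word has bits set. */
	if (ah)
		ret += ((uint64_t)ah * mul) << (32 - shift);

	return ret;
}
```

As long as the cycle count being scaled stays below 2^32, the branch is
not taken and only the first multiply is paid for.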