Re: [PATCH] clocksource_cyc2ns: avoid overflowing 64 bits

From: Chris Metcalf
Date: Wed Nov 16 2016 - 16:05:44 EST


On 11/16/2016 1:04 PM, John Stultz wrote:
On Wed, Nov 16, 2016 at 8:57 AM, Chris Metcalf <cmetcalf@xxxxxxxxxxxx> wrote:
include/linux/clocksource.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 08398182f56e..b2a022acf232 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -175,7 +175,7 @@ static inline u32 clocksource_hz2mult(u32 hz, u32 shift_constant)
*/
static inline s64 clocksource_cyc2ns(cycle_t cycles, u32 mult, u32 shift)
{
- return ((u64) cycles * mult) >> shift;
+ return mult_frac(cycles, mult, 1ULL << shift);
}

So clocksource_cyc2ns() was never intended to be used with
indefinitely large cycle values, and it looks like tile and blackfin
are abusing the interface (the omap usage provide cycle deltas rather
then just the current counter value).

Well, the interface does just say "convert clocksource cycles to nanoseconds". :-)
If you think it's more important that it be a little faster, we should adjust the
documentation to say it is only appropriate for delta-cycles, not absolute cycles.
I've appended a commit that does this if you'd like to take it.

I'd suggest instead to move tile/blackfin to using the generic
sched_clock implementation that most of the architectures use, or
special case the code in the arch specific sched_clock
implementations(as x86 does) instead of modifying the common interface
to better handle a use case its not intended for.

Yes, since tile has a full 64-bit cycle counter, the best thing is to just directly
open-code the mult_frac() in tile's sched_clock(). I'll push that change.
Steven Miao, I assume you should do the same for blackfin.

Thanks!

--
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

From: Chris Metcalf <cmetcalf@xxxxxxxxxxxx>
Date: Wed, 16 Nov 2016 14:24:53 -0500
Subject: [PATCH] clocksource_cyc2ns: document intended range limitation

The "cycles" argument should not be an absolute clocksource cycle
value, as the implementation's arithmetic will overflow relatively
easily. For performance, the implementation is simple and fast,
since the function is intended for only relatively small delta
values of clocksource cycles.

Signed-off-by: Chris Metcalf <cmetcalf@xxxxxxxxxxxx>
---
include/linux/clocksource.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 08398182f56e..5444429884b8 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -171,6 +171,10 @@ static inline u32 clocksource_hz2mult(u32 hz, u32 shift_constant)
*
* Converts cycles to nanoseconds, using the given mult and shift.
*
+ * The code is optimized for performance and not intended to work
+ * with absolute clocksource cycles, as it will easily overflow,
+ * but just intended for relative (delta) clocksource cycles.
+ *
* XXX - This could use some mult_lxl_ll() asm optimization
*/
static inline s64 clocksource_cyc2ns(cycle_t cycles, u32 mult, u32 shift)
--
2.7.2