Subject: [patch] sys_time() speedup
From: Ingo Molnar <mingo@xxxxxxx>
improve performance of sys_time(). sys_time() returns time in seconds, but it does so by calling do_gettimeofday() and then returning the tv_sec portion of the GTOD time. But the data structure "xtime", which is updated by every timer/scheduler tick, already offers HZ granularity time.
the patch improves the sysbench OLTP macrobenchmark significantly:
2.6.22-rc6:
#threads
1: transactions: 3733 (373.21 per sec.)
2: transactions: 6676 (667.46 per sec.)
3: transactions: 6957 (695.50 per sec.)
4: transactions: 7055 (705.48 per sec.)
5: transactions: 6596 (659.33 per sec.)
2.6.22-rc6 + sys_time.patch:
1: transactions: 4005 (400.47 per sec.)
2: transactions: 7379 (737.77 per sec.)
3: transactions: 7347 (734.49 per sec.)
4: transactions: 7468 (746.65 per sec.)
5: transactions: 7428 (742.47 per sec.)
mixed API uses of gettimeofday() and time() are guaranteed to be coherent via the use of a at-most-once-per-second slowpath that updates xtime.
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
kernel/time.c | 27 ++++++++++++++++++++++-----
1 file changed, 22 insertions(+), 5 deletions(-)
Index: linux/kernel/time.c
===================================================================
--- linux.orig/kernel/time.c
+++ linux/kernel/time.c
@@ -57,14 +57,17 @@ EXPORT_SYMBOL(sys_tz);
*/
asmlinkage long sys_time(time_t __user * tloc)
{
- time_t i;
- struct timeval tv;
+ /*
+ * We read xtime.tv_sec atomically - it's updated
+ * atomically by update_wall_time(), so no need to
+ * even read-lock the xtime seqlock:
+ */
+ time_t i = xtime.tv_sec;
- do_gettimeofday(&tv);
- i = tv.tv_sec;
+ smp_rmb(); /* sys_time() results are coherent */
if (tloc) {
- if (put_user(i,tloc))
+ if (put_user(i, tloc))
i = -EFAULT;
}
return i;
@@ -373,6 +376,20 @@ void do_gettimeofday (struct timeval *tv
tv->tv_sec = sec;
tv->tv_usec = usec;
+
+ /*
+ * Make sure xtime.tv_sec [returned by sys_time()] always
+ * follows the gettimeofday() result precisely. This
+ * condition is extremely unlikely, it can hit at most
+ * once per second:
+ */