Hey Prarit,
One possibility is that if the CPU we're doing our timekeeping
accumulation on is different than the one running the test, we might
go into deeper idle for longer periods of time. Then when we
accumulate time, we have more than a single tick to accumulate, and
that might require holding the timekeeper/xtime lock for longer
times.

And the max 2.9ns variance seems particularly low, given that we do
call update_vsyscall every so often, and that should block
clock_gettime() callers while we update the vsyscall data. Could it
be that the test is too short to see the locking effect, so you're
just getting lucky, and that adding nohz is jostling the regularity
of the execution so you then see the lock wait times? If you
increase the samples and sample loops by 1000 does that change the
behavior?

That's a possibility, although I suspect that this has more to do with not
executing the RCU NOHZ code given that we don't see a problem with the
clock_gettime() vs clock_gettime() test. I wonder if not executing the RCU NOHZ
code somehow introduces a "regularity" with execution that results in the CPU
always being in C0/polling when the test is run?
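
For reference, a clock_gettime() vs clock_gettime() measurement of the kind
discussed here could look roughly like the sketch below. This is not the
actual test from this thread, just a minimal illustration: the names
ts_to_ns(), measure_deltas() and struct delta_stats are made up for the
example, and the choice of CLOCK_MONOTONIC is an assumption. It takes
back-to-back readings and tracks the min/max/mean delta, so a caller that
gets stuck waiting on a vsyscall data update should show up as a larger max.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical container for the per-run results. */
struct delta_stats {
	int64_t min_ns;   /* smallest back-to-back delta seen */
	int64_t max_ns;   /* largest back-to-back delta seen */
	double  mean_ns;  /* average delta over all samples */
};

/* Hypothetical helper: flatten a timespec into nanoseconds. */
static int64_t ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Take `samples` pairs of back-to-back clock_gettime() readings and
 * record the delta for each pair. Returns 0 on success, -1 on bad
 * arguments or if clock_gettime() fails.
 *
 * CLOCK_MONOTONIC is an assumption; the test discussed in this thread
 * may use a different clock id.
 */
static int measure_deltas(size_t samples, struct delta_stats *out)
{
	struct timespec a, b;
	int64_t min = INT64_MAX, max = INT64_MIN, sum = 0;
	size_t i;

	if (samples == 0 || out == NULL)
		return -1;

	for (i = 0; i < samples; i++) {
		int64_t d;

		if (clock_gettime(CLOCK_MONOTONIC, &a) != 0 ||
		    clock_gettime(CLOCK_MONOTONIC, &b) != 0)
			return -1;

		d = ts_to_ns(&b) - ts_to_ns(&a);
		if (d < min)
			min = d;
		if (d > max)
			max = d;
		sum += d;
	}

	out->min_ns = min;
	out->max_ns = max;
	out->mean_ns = (double)sum / (double)samples;
	return 0;
}
```

Calling measure_deltas() with a much larger sample count and comparing max_ns
against min_ns, with and without nohz, would be one way to check whether a
short run is simply missing the lock wait times.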