Re: [PATCH v2] timekeeping: move multigrain timestamp floor handling into timekeeper
From: Jeff Layton
Date: Fri Sep 13 2024 - 08:01:43 EST
On Fri, 2024-09-13 at 13:26 +0200, Jan Kara wrote:
> On Thu 12-09-24 14:02:52, Jeff Layton wrote:
> > The kernel test robot reported a performance hit in some will-it-scale
> > tests due to the multigrain timestamp patches. My own testing showed
> > about a 7% drop in performance on the pipe1_threads test, and the data
> > showed that coarse_ctime() was slowing down current_time().
> >
> > Move the multigrain timestamp floor tracking word into timekeeper.c. Add
> > two new public interfaces: The first fills a timespec64 with the later
> > of the coarse-grained clock and the floor time, and the second gets a
> > fine-grained time and tries to swap it into the floor and fills a
> > timespec64 with the result.
> >
> > The first function returns an opaque cookie that is suitable for passing
> > to the second, which will use it as the "old" value in the cmpxchg.
> >
> > With this patch on top of the multigrain series, the will-it-scale
> > pipe1_threads microbenchmark shows these averages on my test rig:
> >
> > v6.11-rc7: 103561295 (baseline)
> > v6.11-rc7 + mgtime + this: 101357203 (~2% performance drop)
> >
> > Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> > Closes: https://lore.kernel.org/oe-lkp/202409091303.31b2b713-oliver.sang@xxxxxxxxx
> > Suggested-by: Arnd Bergmann <arnd@xxxxxxxxxx>
> > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
>
> One question regarding the cookie handling as well :)
>
> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> > index 5391e4167d60..bb039c9d525e 100644
> > --- a/kernel/time/timekeeping.c
> > +++ b/kernel/time/timekeeping.c
> > @@ -114,6 +114,13 @@ static struct tk_fast tk_fast_raw ____cacheline_aligned = {
> > .base[1] = FAST_TK_INIT,
> > };
> >
> > +/*
> > + * This represents the latest fine-grained time that we have handed out as a
> > + * timestamp on the system. Tracked as a monotonic ktime_t, and converted to the
> > + * realtime clock on an as-needed basis.
> > + */
> > +static __cacheline_aligned_in_smp atomic64_t mg_floor;
> > +
> > static inline void tk_normalize_xtime(struct timekeeper *tk)
> > {
> > while (tk->tkr_mono.xtime_nsec >= ((u64)NSEC_PER_SEC << tk->tkr_mono.shift)) {
> > @@ -2394,6 +2401,76 @@ void ktime_get_coarse_real_ts64(struct timespec64 *ts)
> > }
> > EXPORT_SYMBOL(ktime_get_coarse_real_ts64);
> >
> > +/**
> > + * ktime_get_coarse_real_ts64_mg - get later of coarse grained time or floor
> > + * @ts: timespec64 to be filled
> > + *
> > + * Adjust floor to realtime and compare it to the coarse time. Fill
> > + * @ts with the latest one. Returns opaque cookie suitable to pass
> > + * to ktime_get_real_ts64_mg.
> > + */
> > +u64 ktime_get_coarse_real_ts64_mg(struct timespec64 *ts)
> > +{
> > + struct timekeeper *tk = &tk_core.timekeeper;
> > + u64 floor = atomic64_read(&mg_floor);
> > + ktime_t f_real, offset, coarse;
> > + unsigned int seq;
> > +
> > + WARN_ON(timekeeping_suspended);
> > +
> > + do {
> > + seq = read_seqcount_begin(&tk_core.seq);
> > + *ts = tk_xtime(tk);
> > + offset = *offsets[TK_OFFS_REAL];
> > + } while (read_seqcount_retry(&tk_core.seq, seq));
> > +
> > + coarse = timespec64_to_ktime(*ts);
> > + f_real = ktime_add(floor, offset);
> > + if (ktime_after(f_real, coarse))
> > + *ts = ktime_to_timespec64(f_real);
> > + return floor;
> > +}
> > +EXPORT_SYMBOL_GPL(ktime_get_coarse_real_ts64_mg);
> > +
> > +/**
> > + * ktime_get_real_ts64_mg - attempt to update floor value and return result
> > + * @ts: pointer to the timespec to be set
> > + * @cookie: opaque cookie from earlier call to ktime_get_coarse_real_ts64_mg()
> > + *
> > + * Get a current monotonic fine-grained time value and attempt to swap
> > + * it into the floor using @cookie as the "old" value. @ts will be
> > + * filled with the resulting floor value, regardless of the outcome of
> > + * the swap.
> > + */
> > +void ktime_get_real_ts64_mg(struct timespec64 *ts, u64 cookie)
> > +{
> > + struct timekeeper *tk = &tk_core.timekeeper;
> > + ktime_t offset, mono, old = (ktime_t)cookie;
> > + unsigned int seq;
> > + u64 nsecs;
>
> So what would be the difference if we did instead:
>
> old = atomic64_read(&mg_floor);
>
> and not bother with the cookie? AFAIU this could result in somewhat more
> updates to mg_floor (the contention on the mg_floor cacheline would be the
> same but there would be more invalidates of the cacheline). OTOH these
> updates can happen only if max(current_coarse_time, mg_floor) ==
> inode->i_ctime which is presumably rare? What is your concern that I'm
> missing?
>
My main concern is the "somewhat more updates to mg_floor". mg_floor is
a global variable, so one of my main goals is to minimize the updates
to it. There is no correctness issue in doing what you're saying above
(AFAICT anyway), but the window of time between when we fetch the
current floor and try to do the swap will be smaller, and we'll end up
doing more swaps as a result.
Do you have any objection to adding the cookie to this API?
>
> > +
> > + WARN_ON(timekeeping_suspended);
> > +
> > + do {
> > + seq = read_seqcount_begin(&tk_core.seq);
> > +
> > + ts->tv_sec = tk->xtime_sec;
> > + mono = tk->tkr_mono.base;
> > + nsecs = timekeeping_get_ns(&tk->tkr_mono);
> > + offset = *offsets[TK_OFFS_REAL];
> > + } while (read_seqcount_retry(&tk_core.seq, seq));
> > +
> > + mono = ktime_add_ns(mono, nsecs);
> > + if (atomic64_try_cmpxchg(&mg_floor, &old, mono)) {
> > + ts->tv_nsec = 0;
> > + timespec64_add_ns(ts, nsecs);
> > + } else {
> > + *ts = ktime_to_timespec64(ktime_add(old, offset));
> > + }
> > +
> > +}
> > +EXPORT_SYMBOL(ktime_get_real_ts64_mg);
> > +
--
Jeff Layton <jlayton@xxxxxxxxxx>