Re: [PATCH net-next 1/5] timecounter: Add helper for reconstructing partial timestamps

From: John Stultz
Date: Tue May 28 2019 - 22:18:03 EST


On Tue, May 28, 2019 at 4:58 PM Vladimir Oltean <olteanv@xxxxxxxxx> wrote:
>
> Some PTP hardware offers a 64-bit free-running counter whose snapshots
> are used for timestamping, but only makes part of that snapshot
> available as timestamps (low-order bits).
>
> In that case, timecounter/cyclecounter users must bring the cyclecounter
> and timestamps to the same bit width, and they currently have two
> options of doing so:
>
> - Trim the higher bits of the timecounter itself to the number of bits
> of the timestamps. This might work for some setups, but if the
> wraparound of the timecounter in this case becomes high (~10 times per
> second) then this causes additional strain on the system, which must
> read the clock that often just to avoid missing the wraparounds.
>
> - Reconstruct the timestamp by racing to read the PTP time within one
> wraparound cycle since the timestamp was generated. This is
> preferable when the wraparound time is small (do a time-critical
> readout once vs doing it periodically), and it has no drawback even
> when the wraparound is comfortably sized.
>
> Signed-off-by: Vladimir Oltean <olteanv@xxxxxxxxx>
> ---
> include/linux/timecounter.h | 7 +++++++
> kernel/time/timecounter.c | 33 +++++++++++++++++++++++++++++++++
> 2 files changed, 40 insertions(+)
>
> diff --git a/include/linux/timecounter.h b/include/linux/timecounter.h
> index 2496ad4cfc99..03eab1f3bb9c 100644
> --- a/include/linux/timecounter.h
> +++ b/include/linux/timecounter.h
> @@ -30,6 +30,9 @@
> * by the implementor and user of specific instances of this API.
> *
> * @read: returns the current cycle value
> + * @partial_tstamp_mask:bitmask in case the hardware emits timestamps
> + * which only capture low-order bits of the full
> + * counter, and should be reconstructed.
> * @mask: bitmask for two's complement
> * subtraction of non 64 bit counters,
> * see CYCLECOUNTER_MASK() helper macro
> @@ -38,6 +41,7 @@
> */
> struct cyclecounter {
> u64 (*read)(const struct cyclecounter *cc);
> + u64 partial_tstamp_mask;
> u64 mask;
> u32 mult;
> u32 shift;
> @@ -136,4 +140,7 @@ extern u64 timecounter_read(struct timecounter *tc);
> extern u64 timecounter_cyc2time(struct timecounter *tc,
> u64 cycle_tstamp);
>
> +extern u64 cyclecounter_reconstruct(const struct cyclecounter *cc,
> + u64 ts_partial);
> +
> #endif
> diff --git a/kernel/time/timecounter.c b/kernel/time/timecounter.c
> index 85b98e727306..d4657d64e38d 100644
> --- a/kernel/time/timecounter.c
> +++ b/kernel/time/timecounter.c
> @@ -97,3 +97,36 @@ u64 timecounter_cyc2time(struct timecounter *tc,
> return nsec;
> }
> EXPORT_SYMBOL_GPL(timecounter_cyc2time);
> +
> +/**
> + * cyclecounter_reconstruct - reconstructs @ts_partial
> + * @cc: Pointer to cycle counter.
> + * @ts_partial: Typically RX or TX NIC timestamp, provided by hardware as
> + * the lower @partial_tstamp_mask bits of the cycle counter,
> + * sampled at the time the timestamp was collected.
> + * To reconstruct into a full @mask bit-wide timestamp, the
> + * cycle counter is read and the high-order bits (up to @mask) are
> + * filled in.
> + * Must be called within one wraparound of @partial_tstamp_mask
> + * bits of the cycle counter.
> + */
> +u64 cyclecounter_reconstruct(const struct cyclecounter *cc, u64 ts_partial)
> +{
> + u64 ts_reconstructed;
> + u64 cycle_now;
> +
> + cycle_now = cc->read(cc);
> +
> + ts_reconstructed = (cycle_now & ~cc->partial_tstamp_mask) |
> + ts_partial;
> +
> + /* Check lower bits of current cycle counter against the timestamp.
> + * If the current cycle counter is lower than the partial timestamp,
> + * then wraparound surely occurred and must be accounted for.
> + */
> + if ((cycle_now & cc->partial_tstamp_mask) <= ts_partial)
> + ts_reconstructed -= (cc->partial_tstamp_mask + 1);
> +
> + return ts_reconstructed;
> +}
> +EXPORT_SYMBOL_GPL(cyclecounter_reconstruct);

Hrm. Is this actually generic? Would it make more sense to have the
specific implementations with this quirk implement this in their
read() handler? If not, why?

thanks
-john