[RFC] net: dsa: mv88e6xxx: ptp: improve phc2sys precision for mv88e6xxx switch in combination with imx6-fec

From: Hubert Feurstein
Date: Fri Aug 02 2019 - 12:33:10 EST


With this patch the phc2sys synchronisation precision improved to +/-500ns.

Signed-off-by: Hubert Feurstein <h.feurstein@xxxxxxxxx>
---

This patch should only be the base for a discussion about improving precision of
phc2sys (from the linuxptp package) in combination with a mv88e6xxx switch and
imx6-fec.

When I started my work on PTP at the beginning of this week, I was positively
supprised about the sync precision of ptp4l. After adding support for the
MV88E6220 I was able to synchronize two of our boards within +/- 10 nanoseconds.
Remebering that the PTP system in the MV88E6220 is clocked with 100MHz, this is
I think the best what can be expected. Big thanks to Richard and the other
developers who made this possible.

But then I tried to synchornize the PTP clock with the system clock by using
phc2sys (phc2sys -rr -amq -l6) and I quickly was very disapointed about the
precision.

Min: -17829 ns
Max: 21694 ns
StdDev: 8520 ns
Count: 127

So I tried to find the reason for this. And as you probably already know, the
reason for that is the MDIO latency, which is terrible (up to 800 usecs).

As a next step, I tried to to implement the gettimex64 callback (see: "[PATCH]
net: dsa: mv88e6xxx: extend PTP gettime function to read system clock"). With
this in place (and a patched linuxptp-master version which really uses the
PTP_SYS_OFFSET_EXTENDED-ioctl), I got the following results:

Min: -12144 ns
Max: 10891 ns
StdDev: 4046,71 ns
Count: 112

So, things improved, but this is still unacceptable. It was still not possible
to compensate the MDIO latency issue.

According to my understanding, the timestamps (by using
ptp_read_system_{pre|post}ts) have to be captured at a place where we have an
constant offset related to the PHC in the switch. The only point where these
timestamps can be captured is the mdio_write callback in the imx_fec. Because,
reading the PHC timestamp will result in the follwing MDIO accesses:

(several) reads of the AVB_CMD register (to poll for the busy-flag)
write AVB_CMD (AVBOp=110b Read with post-incerement of PHC timestamp)
read AVB_DATA (PTP Global Time [15:0])
read AVB_DATA (PTP Global Time [31:16])

With this sequence in mind, the Marvell switch has to snapshot the PHC
timestamp at the write-AVB_CMD in order to be able to get sane values later by
reading AVB_DATA. So the best place to capture the system timestamps is this
one and only write operation directly in the imx_fec. By using the patch below
(without the changes to the system clock resolution) I got the following
results:

Min: -464 ns
Max: 525 ns
StdDev: 210,31 ns
Count: 401

I would say that is a huge improvement.

I realized, that the system clock (at least on the imx6) has a resolution of
333ns. So I tried to speed up this clock by using the PER-clock instead of
OSC_PER. This gave me 15ns resolution. The results were:

Min: -476 ns
Max: 439 ns
StdDev: 176,52 ns
Count: 630

So, things got improved again a little bit (at least the StdDev).

According to my understanding, this is almost the best which is possible,
because there is one more clock which influences the results. This is the MDIO
bus clock, which is just 2.5MHz (or 400ns). So, I would say that +/- 400ns
jitter is caused only by the MDIO bus clock, since the changes in imx_fec should
not introduce any unpredictable latencies.

My question to the experienced kernel developers is, how can this be implemented
in a more generic way? Because this hack only works under these circumstances,
and of course can never be part of the mainline kernel.

I am not 100% sure that all my statements or assumptions are correct, so feel
free to correct me.

Hubert

drivers/clocksource/timer-imx-gpt.c | 9 ++++++++-
drivers/net/dsa/mv88e6xxx/chip.h | 2 ++
drivers/net/dsa/mv88e6xxx/ptp.c | 11 +++++++----
drivers/net/dsa/mv88e6xxx/smi.c | 2 +-
drivers/net/ethernet/freescale/fec_main.c | 24 ++++++++++++++++++-----
drivers/net/phy/mdio_bus.c | 16 +++++++++++++++
include/linux/mdio.h | 2 ++
include/linux/phy.h | 1 +
8 files changed, 56 insertions(+), 11 deletions(-)

diff --git a/drivers/clocksource/timer-imx-gpt.c b/drivers/clocksource/timer-imx-gpt.c
index 706c0d0ff56c..84695a2d8ff7 100644
--- a/drivers/clocksource/timer-imx-gpt.c
+++ b/drivers/clocksource/timer-imx-gpt.c
@@ -471,8 +471,15 @@ static int __init mxc_timer_init_dt(struct device_node *np, enum imx_gpt_type t

/* Try osc_per first, and fall back to per otherwise */
imxtm->clk_per = of_clk_get_by_name(np, "osc_per");
- if (IS_ERR(imxtm->clk_per))
+
+ /* Force PER clock to be used, so we get 15ns instead of 333ns */
+ //if (IS_ERR(imxtm->clk_per)) {
+ if (1) {
imxtm->clk_per = of_clk_get_by_name(np, "per");
+ pr_info("==> Using PER clock\n");
+ } else {
+ pr_info("==> Using OSC_PER clock\n");
+ }

imxtm->type = type;

diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index 01963ee94c50..9e14dc406415 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -277,6 +277,8 @@ struct mv88e6xxx_chip {
struct ptp_clock_info ptp_clock_info;
struct delayed_work tai_event_work;
struct ptp_pin_desc pin_config[MV88E6XXX_MAX_GPIO];
+ struct ptp_system_timestamp *ptp_sts;
+
u16 trig_config;
u16 evcap_config;
u16 enable_count;
diff --git a/drivers/net/dsa/mv88e6xxx/ptp.c b/drivers/net/dsa/mv88e6xxx/ptp.c
index 073cbd0bb91b..cf6e52ee9e0a 100644
--- a/drivers/net/dsa/mv88e6xxx/ptp.c
+++ b/drivers/net/dsa/mv88e6xxx/ptp.c
@@ -235,14 +235,17 @@ static int mv88e6xxx_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
return 0;
}

-static int mv88e6xxx_ptp_gettime(struct ptp_clock_info *ptp,
- struct timespec64 *ts)
+static int mv88e6xxx_ptp_gettimex(struct ptp_clock_info *ptp,
+ struct timespec64 *ts,
+ struct ptp_system_timestamp *sts)
{
struct mv88e6xxx_chip *chip = ptp_to_chip(ptp);
u64 ns;

mv88e6xxx_reg_lock(chip);
+ chip->ptp_sts = sts;
ns = timecounter_read(&chip->tstamp_tc);
+ chip->ptp_sts = NULL;
mv88e6xxx_reg_unlock(chip);

*ts = ns_to_timespec64(ns);
@@ -426,7 +429,7 @@ static void mv88e6xxx_ptp_overflow_check(struct work_struct *work)
struct mv88e6xxx_chip *chip = dw_overflow_to_chip(dw);
struct timespec64 ts;

- mv88e6xxx_ptp_gettime(&chip->ptp_clock_info, &ts);
+ mv88e6xxx_ptp_gettimex(&chip->ptp_clock_info, &ts, NULL);

schedule_delayed_work(&chip->overflow_work,
MV88E6XXX_TAI_OVERFLOW_PERIOD);
@@ -472,7 +475,7 @@ int mv88e6xxx_ptp_setup(struct mv88e6xxx_chip *chip)
chip->ptp_clock_info.max_adj = MV88E6XXX_MAX_ADJ_PPB;
chip->ptp_clock_info.adjfine = mv88e6xxx_ptp_adjfine;
chip->ptp_clock_info.adjtime = mv88e6xxx_ptp_adjtime;
- chip->ptp_clock_info.gettime64 = mv88e6xxx_ptp_gettime;
+ chip->ptp_clock_info.gettimex64 = mv88e6xxx_ptp_gettimex;
chip->ptp_clock_info.settime64 = mv88e6xxx_ptp_settime;
chip->ptp_clock_info.enable = ptp_ops->ptp_enable;
chip->ptp_clock_info.verify = ptp_ops->ptp_verify;
diff --git a/drivers/net/dsa/mv88e6xxx/smi.c b/drivers/net/dsa/mv88e6xxx/smi.c
index 5fc78a063843..801fd4abba5a 100644
--- a/drivers/net/dsa/mv88e6xxx/smi.c
+++ b/drivers/net/dsa/mv88e6xxx/smi.c
@@ -45,7 +45,7 @@ static int mv88e6xxx_smi_direct_write(struct mv88e6xxx_chip *chip,
{
int ret;

- ret = mdiobus_write_nested(chip->bus, dev, reg, data);
+ ret = mdiobus_write_nested_ptp(chip->bus, dev, reg, data, chip->ptp_sts);
if (ret < 0)
return ret;

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 2f6057e7335d..20f589dc5b8b 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1814,11 +1814,25 @@ static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,

reinit_completion(&fep->mdio_done);

- /* start a write op */
- writel(FEC_MMFR_ST | FEC_MMFR_OP_WRITE |
- FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(regnum) |
- FEC_MMFR_TA | FEC_MMFR_DATA(value),
- fep->hwp + FEC_MII_DATA);
+ if (bus->ptp_sts) {
+ unsigned long flags = 0;
+ local_irq_save(flags);
+ __iowmb();
+ /* >Take the timestamp *after* the memory barrier */
+ ptp_read_system_prets(bus->ptp_sts);
+ writel_relaxed(FEC_MMFR_ST | FEC_MMFR_OP_WRITE |
+ FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(regnum) |
+ FEC_MMFR_TA | FEC_MMFR_DATA(value),
+ fep->hwp + FEC_MII_DATA);
+ ptp_read_system_postts(bus->ptp_sts);
+ local_irq_restore(flags);
+ } else {
+ /* start a write op */
+ writel(FEC_MMFR_ST | FEC_MMFR_OP_WRITE |
+ FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(regnum) |
+ FEC_MMFR_TA | FEC_MMFR_DATA(value),
+ fep->hwp + FEC_MII_DATA);
+ }

/* wait for end of transfer */
time_left = wait_for_completion_timeout(&fep->mdio_done,
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index bd04fe762056..f076487db11f 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -672,6 +672,22 @@ int mdiobus_write_nested(struct mii_bus *bus, int addr, u32 regnum, u16 val)
}
EXPORT_SYMBOL(mdiobus_write_nested);

+int mdiobus_write_nested_ptp(struct mii_bus *bus, int addr, u32 regnum, u16 val, struct ptp_system_timestamp *ptp_sts)
+{
+ int err;
+
+ BUG_ON(in_interrupt());
+
+ mutex_lock_nested(&bus->mdio_lock, MDIO_MUTEX_NESTED);
+ bus->ptp_sts = ptp_sts;
+ err = __mdiobus_write(bus, addr, regnum, val);
+ bus->ptp_sts = NULL;
+ mutex_unlock(&bus->mdio_lock);
+
+ return err;
+}
+EXPORT_SYMBOL(mdiobus_write_nested_ptp);
+
/**
* mdiobus_write - Convenience function for writing a given MII mgmt register
* @bus: the mii_bus struct
diff --git a/include/linux/mdio.h b/include/linux/mdio.h
index e8242ad88c81..7f9767babdc3 100644
--- a/include/linux/mdio.h
+++ b/include/linux/mdio.h
@@ -9,6 +9,7 @@
#include <uapi/linux/mdio.h>
#include <linux/mod_devicetable.h>

+struct ptp_system_timestamp;
struct gpio_desc;
struct mii_bus;

@@ -310,6 +311,7 @@ int mdiobus_read(struct mii_bus *bus, int addr, u32 regnum);
int mdiobus_read_nested(struct mii_bus *bus, int addr, u32 regnum);
int mdiobus_write(struct mii_bus *bus, int addr, u32 regnum, u16 val);
int mdiobus_write_nested(struct mii_bus *bus, int addr, u32 regnum, u16 val);
+int mdiobus_write_nested_ptp(struct mii_bus *bus, int addr, u32 regnum, u16 val, struct ptp_system_timestamp *ptp_sts);

int mdiobus_register_device(struct mdio_device *mdiodev);
int mdiobus_unregister_device(struct mdio_device *mdiodev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 462b90b73f93..fd4e33654863 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -252,6 +252,7 @@ struct mii_bus {
int reset_delay_us;
/* RESET GPIO descriptor pointer */
struct gpio_desc *reset_gpiod;
+ struct ptp_system_timestamp *ptp_sts;
};
#define to_mii_bus(d) container_of(d, struct mii_bus, dev)

--
2.22.0