Re: [PATCH v3 3/5] net: Let the active time stamping layer be selectable.
From: Horatiu Vultur
Date: Fri Mar 10 2023 - 08:15:55 EST
The 03/10/2023 13:15, Michael Walle wrote:
>
> [+ Horatiu]
>
> On 2023-03-10 12:35, Vladimir Oltean wrote:
> > On Fri, Mar 10, 2023 at 11:48:52AM +0100, Köry Maincent wrote:
> > > > From previous discussions, I believe that a device tree property was
> > > > added in order to prevent perceived performance regressions when
> > > > timestamping support is added to a PHY driver, correct?
> > >
> > > Yes, i.e. to select the default and better timestamp on a board.
> >
> > Is there a way to unambiguously determine the "better" timestamping on
> > a board?
> >
> > Is it plausible that over time, when PTP timestamping matures and,
> > for example, MDIO devices get support for PTP_SYS_OFFSET_EXTENDED
> > (an attempt was here: https://lkml.org/lkml/2019/8/16/638), the
> > relationship between PTP clock qualities changes, and so does the
> > preference change?
> >
> > > > I have a dumb question: if updating the device trees is needed in order
> > > > to prevent these behavior changes, then how is the regression problem
> > > > addressed for those device trees which don't contain this new property
> > > > (all device trees)?
> > >
> > > In that case there is not really a solution,
> >
> > If it's not really a solution, then doesn't this fail at its primary
> > purpose of preventing regressions?
> >
> > > but be aware that CONFIG_NETWORK_PHY_TIMESTAMPING needs to be
> > > enabled to allow timestamping on the PHY. Currently in mainline only
> > > a few (3) defconfigs have it enabled, so it is not really widespread,
> >
> > Do distribution kernels use the defconfigs from the kernel, or do they
> > just enable as many options that sound good as possible?
> >
> > > maybe I could add more documentation to prevent further regressions
> > > when adding timestamping support to a PHY driver.
> >
> > My opinion is that either the problem was not correctly identified,
> > or the proposed solution does not address that problem.
> >
> > What I believe is the problem is that adding support for PHY
> > timestamping to a PHY driver will cause a behavior change for existing
> > systems which are deployed with that PHY.
> >
> > If I had a multi-port NIC where all ports share the same PHC, I would
> > want to create a boundary clock with it. I can do that just fine when
> > using MAC timestamping. But assume someone adds support for PHY
> > timestamping and the kernel switches to using PHY timestamps by
> > default. Now I need to keep in sync the PHCs of the PHYs, something
> > which was implicit before (all ports shared the same PHC). I have done
> > nothing incorrectly, yet my deployment doesn't work anymore. This is
> > just an example. It doesn't sound like a good idea in general for new
> > features to cause a behavior change by default.
> >
> > Having identified that as the problem, I guess the solution should be
> > to stop doing that (and even though a PHY driver supports timestamping,
> > keep using the MAC timestamping by default).
> >
> > There is a slight inconvenience caused by the fact that there are
> > already PHY drivers using PHY timestamping, and those may have been
> > introduced into deployments with PHY timestamping. We cannot change the
> > default behavior for those either. There are 5 such PHY drivers today
> > (I've grepped for mii_timestamper in drivers/net/phy).
> >
> > I would suggest that the kernel implements a short whitelist of 5
> > entries containing PHY driver names, which are compared against
> > netdev->phydev->drv->name (with the appropriate NULL pointer checks).
> > Matches will default to PHY timestamping. Otherwise, the new default
> > will be to keep the behavior as if PHY timestamping doesn't exist
> > (MAC still provides the timestamps), and the user needs to select the
> > PHY as the timestamping source explicitly.
> >
> > Thoughts?
>
> While I agree in principle (I have suggested to make MAC timestamping
> the default before), I see a problem with the recent LAN8814 PHY
> timestamping support, which will likely be released with 6.3. That
> would now switch the timestamping to PHY timestamping for our board
> (arch/arm/boot/dts/lan966x-kontron-kswitch-d10-mmt-8g.dts). I could
> argue that is a regression for our board iff NETWORK_PHY_TIMESTAMPING
> is enabled. Honestly, I don't know how to proceed here and haven't
> tried to replicate the regression due to limited time. Assuming,
> that I can show it is a regression, what would be the solution then,
> reverting the commit? Horatiu, any ideas?
I don't think reverting the commit is the best approach, because that
would block adding timestamping support to any of the existing PHYs.
Maybe a better solution is to enable or disable NETWORK_PHY_TIMESTAMPING
depending on where you want the timestamping to be done.
>
> I digress from the original problem a bit. But if there would be such
> a whitelist, I'd propose that it won't contain the lan8814 driver.
I don't have anything against having a whitelist of PHY driver names.
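To make the whitelist idea concrete, here is a minimal userspace sketch
of the check Vladimir described (compare the PHY driver name against a
short list, with NULL pointer checks). The structs are simplified
stand-ins for the kernel ones, the function name
phy_timestamping_is_default() is hypothetical, and the list entries are
placeholders rather than the real driver names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures (assumption: only the
 * fields needed for the name comparison are modelled). */
struct phy_driver { const char *name; };
struct phy_device { const struct phy_driver *drv; };
struct net_device { const struct phy_device *phydev; };

/* Placeholder entries, NOT the real list of PHY drivers. */
static const char * const phy_ts_default_whitelist[] = {
	"example-phy-a",
	"example-phy-b",
};

/* Hypothetical helper: true if PHY timestamping should stay the default
 * for this netdev, false if MAC timestamping should be the default. */
static bool phy_timestamping_is_default(const struct net_device *dev)
{
	size_t n = sizeof(phy_ts_default_whitelist) /
		   sizeof(phy_ts_default_whitelist[0]);
	size_t i;

	/* No PHY, no bound driver or no name: never default to the PHY. */
	if (!dev || !dev->phydev || !dev->phydev->drv ||
	    !dev->phydev->drv->name)
		return false;

	for (i = 0; i < n; i++)
		if (!strcmp(dev->phydev->drv->name,
			    phy_ts_default_whitelist[i]))
			return true;

	return false;
}
```

With something like this, a driver that is not on the list (lan8814 in
Michael's proposal) would keep MAC timestamping unless the user selects
the PHY explicitly.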
>
> Other than that, I guess I have to put some time into testing
> before it's too late.
I was thinking about another scenario (I am sorry if this was already
discussed).
Currently, when timestamping is set up, the MAC checks whether the PHY
has timestamping support; if it does, the PHY does the timestamping.
So if the switch was supposed to be a TC (transparent clock), we had to
make sure that the HW set up rules to copy PTP frames to the CPU instead
of forwarding them in HW.
With this new implementation, that would no longer be possible, as the
MAC is not notified when the timestamping is done in the PHY.
Does it mean that the switch should now allocate these rules at start
time?
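
To illustrate the concern, here is a small userspace model of the two
flows. It is only a sketch under my own assumptions, not kernel code:
struct model_port and both functions are hypothetical names, and the
"trap" flag stands for whatever HW rule copies PTP frames to the CPU.

```c
#include <stdbool.h>

/* Hypothetical model of one switch port (not a kernel structure). */
struct model_port {
	bool phy_has_hwtstamp;   /* PHY driver offers timestamping */
	bool ptp_trap_installed; /* rule: copy PTP frames to CPU, don't forward */
	bool phy_ts_enabled;     /* timestamps are taken in the PHY */
	bool mac_ts_enabled;     /* timestamps are taken in the MAC */
};

/* Current flow: the request enters through the MAC driver, which picks
 * the timestamping layer and can install the trap rule on the way. */
static void model_mac_hwtstamp_set(struct model_port *p)
{
	/* The MAC sees the request, so it gets a chance to set up the rule. */
	p->ptp_trap_installed = true;

	if (p->phy_has_hwtstamp)
		p->phy_ts_enabled = true; /* hand the request to the PHY */
	else
		p->mac_ts_enabled = true; /* timestamp in the MAC itself */
}

/* Flow as I understand the new implementation: the request goes to the
 * PHY directly and the MAC driver is never called. */
static void model_phy_hwtstamp_set_direct(struct model_port *p)
{
	p->phy_ts_enabled = true;
	/* ptp_trap_installed is left untouched: nobody told the MAC. */
}
```

In the second flow the port ends up timestamping in the PHY while the
trap rule is still missing, which is the case I am asking about.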
>
> -michael
--
/Horatiu