Re: Circular dependency between DSA switch driver and tagging protocol driver

From: Vladimir Oltean
Date: Wed Sep 29 2021 - 10:07:37 EST


On Wed, Sep 08, 2021 at 04:36:00PM -0700, Florian Fainelli wrote:
> On 9/8/2021 3:19 PM, Vladimir Oltean wrote:
> > On Wed, Sep 08, 2021 at 03:14:51PM -0700, Florian Fainelli wrote:
> > > On 9/8/2021 3:08 PM, Vladimir Oltean wrote:
> > > > Hi,
> > > >
> > > > Since commits 566b18c8b752 ("net: dsa: sja1105: implement TX
> > > > timestamping for SJA1110") and 994d2cbb08ca ("net: dsa: tag_sja1105: be
> > > > dsa_loop-safe"), net/dsa/tag_sja1105.ko has gained a build and insmod
> > > > time dependency on drivers/net/dsa/sja1105.ko, due to several symbols
> > > > exported by the latter and used by the former.
> > > >
> > > > So first one needs to insmod sja1105.ko, then insmod tag_sja1105.ko.
> > > >
> > > > But dsa_port_parse_cpu returns -EPROBE_DEFER when dsa_tag_protocol_get
> > > > returns -ENOPROTOOPT. It means, there is no DSA_TAG_PROTO_SJA1105 in the
> > > > list of tagging protocols known by DSA, try again later. There is a
> > > > runtime dependency for DSA to have the tagging protocol loaded. Combined
> > > > with the symbol dependency, this is a de facto circular dependency.
> > > >
> > > > So when we first insmod sja1105.ko, nothing happens, probing is deferred.
> > > >
> > > > Then when we insmod tag_sja1105.ko, we expect the DSA probing to kick
> > > > off where it left from, and probe the switch too.
> > > >
> > > > However this does not happen because the deferred probing list in the
> > > > device core is reconsidered for a new attempt only if a driver is bound
> > > > to a new device. But DSA tagging protocols are drivers with no struct
> > > > device.
> > > >
> > > > One can of course manually kick the driver after the two insmods:
> > > >
> > > > echo spi0.1 > /sys/bus/spi/drivers/sja1105/bind
> > > >
> > > > and this works, but automatic module loading based on modaliases will be
> > > > broken if both tag_sja1105.ko and sja1105.ko are modules, and sja1105 is
> > > > the last device to get a driver bound to it.
> > > >
> > > > Where is the problem?
> > >
> > > I'd say with 994d2cbb08ca, since the tagger now requires visibility into
> > > sja1105_switch_ops which is not great, to say the least. You could solve
> > > this by:
> > >
> > > - splitting up the sja1150 between a library that contains
> > > sja1105_switch_ops and does not contain the driver registration code
> > >
> > > - finding a different way to do a dsa_switch_ops pointer comparison, by
> > > e.g.: maintaining a boolean in dsa_port that tracks whether a particular
> > > driver is backing that port
> >
> > What about 566b18c8b752 ("net: dsa: sja1105: implement TX timestamping for SJA1110")?
> > It is essentially the same problem from a symbol usage perspective, plus
> > the fact that an skb queue belonging to the driver is accessed.
>
> I believe we will have to accept that another indirect function call must be
> made in order to avoid creating a direct symbol dependency with
> sja1110_rcv_meta() would that be acceptable performance wise?
> --
> Florian

The same circular dependency problem exists between net/dsa/tag_ocelot_8021q.c
and drivers/net/ethernet/mscc/ocelot.c, which exports ocelot_port_inject_frame
and co. This one is fundamentally even worse, I could make all of the
ocelot_port_inject_frame related functions static inline inside
include/linux/dsa/ocelot.h, but for that to do anything, I'd also need
to make the I/O functions themselves static inline, like __ocelot_write_ix
which is called by ocelot_write_rix. That doesn't sound too fun, sooner
or later I'm going to make the entire driver static inline :)

The alternative I see would be to just inject the PTP frames from a
deferred worker, a la sja1105 with its magical SPI commands. A positive
side effect of doing that would be that I can even try harder to inject
them. Currently, because DSA uses NETIF_F_LLTX, it can't return
NETDEV_TX_BUSY to ask the qdisc to requeue the packet, because the qdisc
is, well, noqueue (we talked about this before). But "ocelot_can_inject"
can return false, due to nasty stuff like egress congestion. The current
behavior is to immediately drop the PTP frame, which leads to its own
kind of nastiness - we have that skb enqueued in a list of frames
awaiting TX timestamps, but the timestamp for it will never come because
it's been silently dropped. Then we have a timestamp ID that rolls over
after it counts to 4, and since we keep counting, future PTP timestamps
might match on the skb that is still in the TX timestamping queue but
stale. And nobody will match on the real skb. It goes downhill really
fast and stinks.

What do you think, should I go ahead and make a second user of deferred
xmit (first and foremost to break the cyclic dependency between the
tagger and the driver), and try harder to enqueue the PTP packets
through register-based MMIO?