Re: [PATCH] optoe: driver to read/write SFP/QSFP EEPROMs

From: Don Bollinger
Date: Thu Jun 14 2018 - 22:27:03 EST


On Tue, Jun 12, 2018 at 08:11:09PM +0200, Andrew Lunn wrote:
> > There's an SFP driver under drivers/net/phy. Can that driver be extended
> > to provide this support? Adding Russel King who developed sfp.c, as well
> > at the netdev mailing list.
>
> I agree, the current SFP code should be used.
>
> My observations seem to be there are two different ways {Q}SFP are used:
>
> 1) The Linux kernel has full control, as assumed by the devlink/SFP
> frame work. We parse the SFP data to find the capabilities of the SFP
> and use it to program the MAC to use the correct mode. The MAC can be
> a NIC, but it can also be a switch. DSA is gaining support for
> PHYLINK, so SFP modules should just work with most switches which DSA
> support. And there is no reason a plain switchdev switch can not use
> PHYLINK.
>
> 2) Firmware is in control of the PHY layer, but there is a wish to
> expose some of the data which is available via i2c from the {Q}SFP to
> linux.
>
> It appears this optoe supports this second case. It does not appear to
> support any in kernel API to actually make use of the SFP data in the
> kernel.
>
> We should not be duplicating code. We should share the SFP code for
> both use cases above. There is also a Linux standard API for getting
> access to this information. ethtool -m/--module-info. Anything which
> is exporting {Q}SFP data needs to use this API.
>
> Andrew

Actually this is better described by a third use case. The target
switches are PHY-less (see various designs at
www.compute.org/wiki/Networking/SpecsAndDesigns). The AS5712 for example
says "The AS5712-54X is a PHY-Less design with the SFP+ and QSFP+
connections directly attaching to the Serdes interfaces of the Broadcom
BCM56854 720G Trident 2 switching silicon..."

The electrical controls of the {Q}SFP devices (TxDisable for example)
are organized in a platform dependent way, through CPLD devices, and
managed by a platform specific CPLD driver.

The i2c bus is muxed from the CPU to all of the {Q}SFP devices, which
are set up as standard linux i2c devices
(/sys/bus/i2c/devices/i2c-xxxx).

There is no MDIO bus between the CPU and the {Q}SFP devices.

> 2) Firmware is in control of the PHY layer, but there is a wish to
> expose some of the data which is available via i2c from the {Q}SFP to
> linux.

So the switch silicon is in control of the PHY layer. The platform
driver is in control of the electrical interfaces. And the EEPROM data
is available via I2C.

And, there isn't actually 'a wish to expose' the EEPROM data to linux
(the kernel). It turns out that none of the NOS partners I'm working
with use that data *in the kernel*. It is all managed from user space.

More generally, I think sfp.c and optoe are not actually trying to
accomplish the same thing at all. sfp.c combines all three functions
(PHY, electrical control, EEPROM access). optoe is only providing EEPROM
access, and only to user space. This is a real need in the white box
switch environment, and is not met by sfp.c. optoe isn't better, sfp.c
isn't better, they're just different.

Finally, sfp.c does not recognize that SFP devices have data beyond 512
bytes, accessible via a page register. It also does not recognize QSFP
devices at all. QSFP devices have only 256 bytes accessible (one i2c
address) before switching to paged access for the remaining data. The
first design requirement for optoe was to access all the available
pages, because there is information and controls that we (optics
vendors) want to make available to network management applications.

If sfp.c creates a standard linux i2c client for each SFP device, it
should be possible to create an optoe managed device 'under' sfp.c to
provide access to the full EEPROM address space:
# echo optoe2 0x50 >/sys/bus/i2c/devices/i2c-xx/new_device
This might prove useful to user space consumers of that data. We could
also easily add a kernel API (eg the nvmem framework) to optoe to provide
kernel access. In other words, sfp.c could assign EEPROM management to
optoe, while managing the electrical interfaces. (This is actually
pretty close to how the platfom drivers work in the switch environment.)
sfp.c would get SFP page support and QSFP EEPROM access 'for free'.

> There is also a Linux standard API for getting
> access to this information. ethtool -m/--module-info. Anything which
> is exporting {Q}SFP data needs to use this API.

optoe simply provides direct access from user space to the full EEPROM
data. There is more data there than ethtool knows about, and in some
devices there are proprietary registers that ethtool will never know
about. optoe does not interpret any of the EEPROM content (except the
bare minimum to access pages correctly). optoe also does not get in the
way of ethtool. It could prove to be a handy way for ethtool to access
new EEPROM fields in the future. QSFP-DD/OSFP are coming soon, they
will have a different (incompatible) set of new fields to be decoded.

Bottom Line: sfp.c is not a useful starting point for the switch
environment I'm working in. The underlying hardware architecture is
quite different. optoe is not a competing alternative. Its only
function is to provide user-space access to the EEPROM data in {Q}SFP
devices.

Don