Re: [PATCH] optoe: driver to read/write SFP/QSFP EEPROMs

From: Florian Fainelli
Date: Thu Jun 14 2018 - 23:24:50 EST

On 06/14/2018 07:26 PM, Don Bollinger wrote:
> On Tue, Jun 12, 2018 at 08:11:09PM +0200, Andrew Lunn wrote:
>>> There's an SFP driver under drivers/net/phy. Can that driver be extended
>>> to provide this support? Adding Russel King who developed sfp.c, as well
>>> at the netdev mailing list.
>> I agree, the current SFP code should be used.
>> My observations seem to be there are two different ways {Q}SFP are used:
>> 1) The Linux kernel has full control, as assumed by the devlink/SFP
>> frame work. We parse the SFP data to find the capabilities of the SFP
>> and use it to program the MAC to use the correct mode. The MAC can be
>> a NIC, but it can also be a switch. DSA is gaining support for
>> PHYLINK, so SFP modules should just work with most switches which DSA
>> support. And there is no reason a plain switchdev switch can not use
>> 2) Firmware is in control of the PHY layer, but there is a wish to
>> expose some of the data which is available via i2c from the {Q}SFP to
>> linux.
>> It appears this optoe supports this second case. It does not appear to
>> support any in kernel API to actually make use of the SFP data in the
>> kernel.
>> We should not be duplicating code. We should share the SFP code for
>> both use cases above. There is also a Linux standard API for getting
>> access to this information. ethtool -m/--module-info. Anything which
>> is exporting {Q}SFP data needs to use this API.
>> Andrew
> Actually this is better described by a third use case. The target
> switches are PHY-less (see various designs at
> The AS5712 for example
> says "The AS5712-54X is a PHY-Less design with the SFP+ and QSFP+
> connections directly attaching to the Serdes interfaces of the Broadcom
> BCM56854 720G Trident 2 switching silicon..."
> The electrical controls of the {Q}SFP devices (TxDisable for example)
> are organized in a platform dependent way, through CPLD devices, and
> managed by a platform specific CPLD driver.
> The i2c bus is muxed from the CPU to all of the {Q}SFP devices, which
> are set up as standard linux i2c devices
> (/sys/bus/i2c/devices/i2c-xxxx).
> There is no MDIO bus between the CPU and the {Q}SFP devices.
>> 2) Firmware is in control of the PHY layer, but there is a wish to
>> expose some of the data which is available via i2c from the {Q}SFP to
>> linux.
> So the switch silicon is in control of the PHY layer. The platform
> driver is in control of the electrical interfaces. And the EEPROM data
> is available via I2C.
> And, there isn't actually 'a wish to expose' the EEPROM data to linux
> (the kernel). It turns out that none of the NOS partners I'm working
> with use that data *in the kernel*. It is all managed from user space.
> More generally, I think sfp.c and optoe are not actually trying to
> accomplish the same thing at all. sfp.c combines all three functions
> (PHY, electrical control, EEPROM access). optoe is only providing EEPROM
> access, and only to user space. This is a real need in the white box
> switch environment, and is not met by sfp.c. optoe isn't better, sfp.c
> isn't better, they're just different.

sfp exposes standard ethtool hooks such as get_module_info() which pass
the whole data blob to user-space, e.g: ethtool where all of this is
better interpreted.

> Finally, sfp.c does not recognize that SFP devices have data beyond 512
> bytes, accessible via a page register. It also does not recognize QSFP
> devices at all. QSFP devices have only 256 bytes accessible (one i2c
> address) before switching to paged access for the remaining data. The
> first design requirement for optoe was to access all the available
> pages, because there is information and controls that we (optics
> vendors) want to make available to network management applications.

Patches welcome if you wish to extend sfp.c to support QSFP devices for

> If sfp.c creates a standard linux i2c client for each SFP device, it
> should be possible to create an optoe managed device 'under' sfp.c to
> provide access to the full EEPROM address space:

It's the other way around, SFP relies on a standard Linux i2c bus master
to exist such that it can read the EEPROM from the standard slave
address location, same goes with a possibly present PHY.

> # echo optoe2 0x50 >/sys/bus/i2c/devices/i2c-xx/new_device
> This might prove useful to user space consumers of that data. We could
> also easily add a kernel API (eg the nvmem framework) to optoe to provide
> kernel access. In other words, sfp.c could assign EEPROM management to
> optoe, while managing the electrical interfaces. (This is actually
> pretty close to how the platfom drivers work in the switch environment.)
> sfp.c would get SFP page support and QSFP EEPROM access 'for free'.

That sounds like a possibly good approach.

>> There is also a Linux standard API for getting
>> access to this information. ethtool -m/--module-info. Anything which
>> is exporting {Q}SFP data needs to use this API.
> optoe simply provides direct access from user space to the full EEPROM
> data. There is more data there than ethtool knows about, and in some
> devices there are proprietary registers that ethtool will never know
> about. optoe does not interpret any of the EEPROM content (except the
> bare minimum to access pages correctly). optoe also does not get in the
> way of ethtool. It could prove to be a handy way for ethtool to access
> new EEPROM fields in the future. QSFP-DD/OSFP are coming soon, they
> will have a different (incompatible) set of new fields to be decoded.

sfp is the same it only passes the EEPROM information to user-space and
interprets just what it needs to get the work done.

> Bottom Line: sfp.c is not a useful starting point for the switch
> environment I'm working in. The underlying hardware architecture is
> quite different. optoe is not a competing alternative. Its only
> function is to provide user-space access to the EEPROM data in {Q}SFP
> devices.

I just don't understand why would we want optoe when the standard way to
expose both EEPROM and diagnostics modules has been through ethtool's
get_module_info since the basic paradigm is that a network port is a
net_device instance in the kernel. If that basic device driver model
does not exist, then it is unclear to me what are the benefits.

Would I be completely wrong if I wrote that you are likely working with
a switch SDK which primarily runs in user-space and so with lack of a
proper kernel-based device driver for your piece of hardware you are
attempting to create a driver which is some sort of a "tap" for some
specific portion of that larger hardware block?