Re: [PATCH 1/2] mtd: spi-nor: When a flash memory is missing do not report an error

From: Michal Suchánek
Date: Sun Jul 24 2022 - 12:00:03 EST


On Sat, Jul 16, 2022 at 11:44:48AM +0200, Michael Walle wrote:
> Am 2022-07-16 11:38, schrieb Michal Suchánek:
> > On Sat, Jul 16, 2022 at 11:30:12AM +0200, Michael Walle wrote:
> > > Am 2022-07-16 10:20, schrieb Michal Suchánek:
> > >
> > > > > So if DT says there isn't a flash on a specific CS when there is, then
> > > > > DT should be fixed to describe the flash, and then we can probe it.
> > > > > You
> > > > > both seem to be saying the same thing here, and I agree.
> > > >
> > > > The disagreement is about the situation when there is sometimes a flash
> > > > chip.
> > >
> > > No. The disagreement is what should happen if the DT says there is
> > > a device but there isn't. Which right now is an error and it should
> > > stay that way. Your hardware description says there is a flash
> > > but it cannot be probed, so it is an error. What about a board
> > > which has an actual error and the flash isn't responding? You
> > > trade one use case for another.
> >
> > And what if you have a SATA controller with a bad cable?
> >
> > Or a bad connection to a mmc card?
>
> Again. You don't tell the kernel via the DT that there is an
> MMC card nor that there is a SATA SDD. In contrast to SPI-NOR..

So what?

there is some firmware table that can enable/disable individual SATA
ports, and some BIOSes have the option to disable individual ports.

Sure, you could insist that users should disable unused ports, and
having enabled port with no disk is an error but it's unreasonable - the
users would have to flip the setting unnecessarily because the kernel
can probe the presence of the disk but insists on assuming there is
one on every port. So people would rightfully point out that it's not
inherently an error to have a SATA port but no disk attached.

What I want is merely the same level of service for SPI NOR - that the
user does not need to switch the devicetrees when the presence of a
flash chip can be probed.

You keep repeting that devicetree describes the hardware but that's not
what it does.

"A |spec|-compliant devicetree describes device information in a system
that cannot necessarily be dynamically detected by a client program."

So it does not describe 'hardware', it describes hardware properties
that cannot be probed.

Let's dissect the mmc situation in a bit more detail:

If the device tree describes a MMC bus it means that there are some
traces attached to some hardware pins, and that the pinmux should be
configured to expose a mmc peripherial on these pins.

The decision to use these pins for mmc is somewhat arbitrary. It must be
specified in the devicetree because it cannot be probed but on many
boards you could use the same pins for different hardware peripherial
such as an UART, and with bitbanging low-speed protocols the
possibilities are endless. Even if there is a SD socket that does not
really mean anything - there are many abuses of standard connectors for
non-standard purposes all around.

Nonetheless, if at least some boards come with a MMC device soldered to
the pins or the documentation advises to use a SD/SDIO card then the
pins should be configured as MMC, and it is not invalidated by the fact
that the pins could theoretically be abused for other purposes.

Also on some boards there is always a MMC device soldered, and lack of
the device could be considered an error. Nonetheless, on other boards
the device may be present only on some boards or socketed. The mmc
driver is driver for all of these boards so it has to deal with missing
device gracefully.

Sure, it means that for some boards that have damaged hardware no error
is reported. That's the result of drivers being general purpose, not
baord specific. The same way on some PC mainboards you could consider
missing SATA drive on a specific port an error but the driver is
generic, and should handle also boards that don't necessarily always
come with all ports populated.

There is also the possibility that you have a SDIO WiFi card that does
not have an eeprom with a MAC address, and an EEPROM is mounted
somewhere on the board to store it. In thios case although the MMC bus
is 'discoverable', and the presence and type of the SDIO card can be
probed all right the relationship to the eeprom cannot, and has to be
recorded in the device tree. Will missing the SDIO card cause an error?
I doubt it, MMC is the great 'discoverable' bus after all so here you
have it - a hardware is described, it's missing, and it's not an error.

So for SPI NOR we similarily have some traces that are connected to some
hardware pins, and given that the board comes with a SPI NOR at least
sometimes, or it is documented that these traces should be used for
mounting a SPI NOR that should be described in the device tree. It is
not invalidated by the fact that not all boards come with the SPI NOR or
that it comes in an optional separate bag with accessories or that the
pins coud be theoretically abused for something else.

Now because SPI bus is not 'discoverable' not only the arbitrary
decision to mux the SPI peripehrial on those pins needs to be recorded
but also the arbitrary decision to use the pins to acces a SPI NOR, and
not any other random SPI peripherial.

Again, the spi-nor driver is driver not only the boards that always come
with the flash soldered but also for the boards where it is mounted only
sometimes. It should deal gracefully with the flash missing as much as
it deals gracefully with different flash memory sizes, erase block
sizes, etc. It's a parameter that can be probed.

> > Then the kernel also does not say there is an error and simply does not
> > see the device.
> >
> > This is normal. Not all devices that can potentially exist do exist. It
> > is up to the user to decide if it's an error that the device is missing.
> >
> > > Also I've looked at the PHY subsystem and there, if a PHY is described
> > > in the DT but isn't there, the following error will be printed:
> > > dev_err(&mdio->dev, "MDIO device at address %d is missing.\n",
> > > addr);
> > >
> > > And that is for a bus which can even be automatically be
> > > probed/detected.
> >
> > If there is no use case for having a card with unpopulated PHY then it
> > makes sense.
> >
> > Here we do have a use case so the comparison is moot.
>
> And what use case is that? You are just demoting an error

Boards with optional SPI NOR device.

> to an info. Apparently you just want to see that error
> go away, there is no change in functionality.

Yes, it's a sign of a bad assumption on the part of the driver. Sure, it
might have been true in the past but with new hardware it's no longer
the case so the driver should adjust.

Currently the driver problem is worked around by being over-specific
in devicetree but ultimately the driver is broken and should be fixed.

Thanks

Michal