Re: [PATCH v10 3/5] drivers/soc/litex: add LiteX SoC Controller driver

From: Stafford Horne
Date: Mon Sep 14 2020 - 09:36:53 EST


On Mon, Sep 14, 2020 at 12:33:11PM +0200, Mateusz Holenko wrote:
> On Fri, Sep 11, 2020 at 2:57 AM Stafford Horne <shorne@xxxxxxxxx> wrote:
> >
> > On Wed, Aug 12, 2020 at 02:34:34PM +0200, Mateusz Holenko wrote:
> > > From: Pawel Czarnecki <pczarnecki@xxxxxxxxxxxxxxxxxxxxxxxx>
> > >
> > > This commit adds driver for the FPGA-based LiteX SoC
> > > Controller from LiteX SoC builder.
> > >
> > > Co-developed-by: Mateusz Holenko <mholenko@xxxxxxxxxxxx>
> > > Signed-off-by: Mateusz Holenko <mholenko@xxxxxxxxxxxx>
> > > Signed-off-by: Pawel Czarnecki <pczarnecki@xxxxxxxxxxxxxxxxxxxxxxxx>
> > > ---
> > > + node = dev->of_node;
> > > + if (!node)
> > > + return -ENODEV;

We return here without BUG() if the setup fails.

> > > +
> > > + soc_ctrl_dev = devm_kzalloc(dev, sizeof(*soc_ctrl_dev), GFP_KERNEL);
> > > + if (!soc_ctrl_dev)
> > > + return -ENOMEM;

We return here without BUG() if we are out of memory.

> > > +
> > > + soc_ctrl_dev->base = devm_platform_ioremap_resource(pdev, 0);
> > > + if (IS_ERR(soc_ctrl_dev->base))
> > > + return PTR_ERR(soc_ctrl_dev->base);

Etc.

> > > +
> > > + result = litex_check_csr_access(soc_ctrl_dev->base);
> > > + if (result) {
> > > + // LiteX CSRs access is broken which means that
> > > + // none of LiteX drivers will most probably
> > > + // operate correctly
> > The comment format here with // is not usually used in the kernel, but its not
> > forbidded. Could you use the /* */ multiline style?
>
> Sure, I'll change the commenting style here.
>
> >
> > > + BUG();
> > Instead of stopping the system with BUG, could we just do:
> >
> > return litex_check_csr_access(soc_ctrl_dev->base);
> >
> > We already have failure for NODEV/NOMEM so might as well not call BUG() here
> > too.
>
> It's true that litex_check_csr_accessors() already generates error
> codes that could be
> returned directly.
> The point of using BUG() macro here, however, is to stop booting the
> system so that it's visible
> (and impossible to miss for the user) that an unresolvable HW issue
> was encountered.
>
> CSR-accessors - the litex_{g,s}et_reg() functions - are intended to be
> used by other LiteX drivers
> and it's very unlikely that those drivers would work properly after
> the fail of litex_check_csr_accessors().
> Since in such case the UART driver will be affected too (no boot logs
> and error messages visible to the user),
> I thought it'll be easier to spot and debug the problem if the system
> stopped in the BUG loop.
> Perhaps there are other, more linux-friendly, ways of achieving a
> similar goal - I'm open for suggestions.

I see your point, but I thought if failed with an exit status above, we could do
the same here. But I guess failing here means that something is really wrong as
validation failed.

Some points:
- If we return here, the system will still boot but there will be no UART
- If we bail with BUG(), here the system stops, and there is no UART
- Both cases the user can connect with a debugger and read "dmesg", to see what
is wrong, but BUG() does not print an error message on all architectures.

We could also use:

- WARN(1, "Failed to validate CSR registers, the system is probably broken.");

If you want to keep BUG() it may be fine.

I am not an expert on handling these type of bailout's so other input is
appreciated.

-Stafford