Re: [PATCH v11 2/3] fpga: microchip-spi: add Microchip MPF FPGA manager

From: Ivan Bornyakov
Date: Wed May 11 2022 - 07:47:10 EST


On Wed, May 11, 2022 at 11:36:33AM +0000, Conor.Dooley@xxxxxxxxxxxxx wrote:
> On 11/05/2022 09:15, Ivan Bornyakov wrote:
> > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> >
> > On Tue, May 10, 2022 at 12:29:54PM +0100, Conor Dooley wrote:
> >> On 09/05/2022 19:56, Conor Dooley wrote:
> >>> On 09/05/2022 18:16, Ivan Bornyakov wrote:
> >>>> On Mon, May 09, 2022 at 11:41:18AM +0000, Conor.Dooley@xxxxxxxxxxxxx wrote:
> >>>>> Hey Ivan, one comment below.
> >>>>> Thanks,
> >>>>> Conor.
> >>>>>
> >>>>> On 07/05/2022 08:43, Ivan Bornyakov wrote:
> >>>>>> ... snip ...
> >>>>>> +static int mpf_read_status(struct spi_device *spi)
> >>>>>> +{
> >>>>>> + u8 status, status_command = MPF_SPI_READ_STATUS;
> >>>>>> + struct spi_transfer xfer = {
> >>>>>> + .tx_buf = &status_command,
> >>>>>> + .rx_buf = &status,
> >>>>>> + .len = 1,
> >>>>>> + };
> >>>>>> + int ret = spi_sync_transfer(spi, &xfer, 1);
> >>>>>> +
> >>>>>> + if ((status & MPF_STATUS_SPI_VIOLATION) ||
> >>>>>> + (status & MPF_STATUS_SPI_ERROR))
> >>>>>> + ret = -EIO;
> >>>>>> +
> >>>>>> + return ret ? : status;
> >>>>>> +}
> >>>>>> +
> >>>>>> ... snip ...
> >>>>>> +
> >>>>>> +static int poll_status_not_busy(struct spi_device *spi, u8 mask)
> >>>>>> +{
> >>>>>> + int status, timeout = MPF_STATUS_POLL_TIMEOUT;
> >>>>>> +
> >>>>>> + while (timeout--) {
> >>>>>> + status = mpf_read_status(spi);
> >>>>>> + if (status < 0 ||
> >>>>>> + (!(status & MPF_STATUS_BUSY) && (!mask || (status & mask))))
> >>>>>> + return status;
> >>>>>> +
> >>>>>> + usleep_range(1000, 2000);
> >>>>>> + }
> >>>>>> +
> >>>>>> + return -EBUSY;
> >>>>>> +}
> >>>>>
> >>>>> Is there a reason you changed this from the snippet you sent me
> >>>>> in the responses to version 8:
> >>>>> static int poll_status_not_busy(struct spi_device *spi, u8 mask)
> >>>>> {
> >>>>> u8 status, status_command = MPF_SPI_READ_STATUS;
> >>>>> int ret, timeout = MPF_STATUS_POLL_TIMEOUT;
> >>>>> struct spi_transfer xfer = {
> >>>>> .tx_buf = &status_command,
> >>>>> .rx_buf = &status,
> >>>>> .len = 1,
> >>>>> };
> >>>>>
> >>>>> while (timeout--) {
> >>>>> ret = spi_sync_transfer(spi, &xfer, 1);
> >>>>> if (ret < 0)
> >>>>> return ret;
> >>>>>
> >>>>> if (!(status & MPF_STATUS_BUSY) && (!mask || (status & mask)))
> >>>>> return status;
> >>>>>
> >>>>> usleep_range(1000, 2000);
> >>>>> }
> >>>>>
> >>>>> return -EBUSY;
> >>>>> }
> >>>>>
> >>>>> With the current version, I hit the "Failed to write bitstream
> >>>>> frame" check in mpf_ops_write at random points in the transfer.
> >>>>> Replacing poll_status_not_busy with the above allows it to run
> >>>>> to completion.
> >>>>
> >>>> In my eyes they are equivalent, aren't they?
> >>>>
> >>>
> >>> I was in a bit of a rush today & didn't have time to do proper
> >>> debugging, I'll put some debug code in tomorrow and try to find
> >>> exactly what is different between the two.
> >>>
> >>> Off the top of my head, since I don't have a board on me to test,
> >>> the only difference I can see is that with the snippet you only
> >>> checked if spi_sync_transfer was negative whereas now you check
> >>> if it has a value at all w/ that ternary operator.
> >>>
> >>> But even that seems like it *shouldn't* be the problem, since ret
> >>> should contain -errno or zero, right?
> >>> Either way, I will do some digging tomorrow.
> >>
> >> I put a printk("status %x, ret %d", status, ret); into the failure
> >> path of mpf_read_status() & it looks like a status 0xA is being
> >> returned - error & ready? That seems like a very odd combo to be
> >> getting back out of it. It shouldn't be dodgy driver/connection
> >> either, b/c that's what I see if I connect my protocol analyser:
> >> https://i.imgur.com/VbjgfCk.png
> >>
> >> That's mosi (hex), ss, sclk, mosi, miso (hex), miso in descending
> >> order.
> >>
> >> I think what was happening was with the snippet you returned one
> >> of the following: -EBUSY, ret (aka -errno) or status. Since status
> >> is positive, the checks in mpf_spi_write.*() saw nothing wrong at
> >> all and programming continued despite there being a problem.
> >>
> >> The new version fixes this by returning -EIO rather than status from
> >> poll_status_not_busy().
> >>
> >> I wish I had a socketable PolarFire so I could investigate further,
> >> but this looks like it might a be hardware issue somewhere on my
> >> end?
> >>
> >> So ye, sorry for the noise and carry on! I'll try tofind what is to
> >> blame for it.
> >>
> >> Thanks,
> >> Conor.
> >>
> >
> > Hi, Conor.
> >
> > I've just noticed in SPI-DirectC User Guide [1] ch. 9 SmartFusion2 and
> > IGLOO2 SPI-Slave Programming Waveform Analysis, that hw status checked
> > two times every time. Does MPF family also need double check hw status?
> > Does adding second mpf_read_status() to poll_status_not_busy() routine
> > help with your issue?
>
> Hey Ivan,
> Tried your suggestion. Previously I was failing quite consistently at
> transfer 34 of 590k, and sometimes making it a further. With your
> suggestion, I was making it significantly further (100k+) but still
> running into some of the 0xA status.
> Decided to move the double check into mpfs_read_status (see the below
> diff) did not run into any the 0xA statuses.
> It's worth pointing out that this is the *first* time I have seen
> Flash Pro Express report that the FPGA array has been enabled after
> programming!

That's good news!

> Seems like at the very least this (hacky) diff is not harmful?
> Please give it a try yourself and check that things still work for
> you.
>
> diff --git a/drivers/fpga/microchip-spi.c b/drivers/fpga/microchip-spi.c
> index 63b75dff2522..183cdfc05c4a 100644
> --- a/drivers/fpga/microchip-spi.c
> +++ b/drivers/fpga/microchip-spi.c
> @@ -47,18 +47,30 @@ struct mpf_priv {
> static int mpf_read_status(struct spi_device *spi)
> {
> u8 status, status_command = MPF_SPI_READ_STATUS;
> + u8 status_repeat;
> struct spi_transfer xfer = {
> .tx_buf = &status_command,
> .rx_buf = &status,
> .len = 1,
> };
> + struct spi_transfer xfer_repeat = {
> + .tx_buf = &status_command,
> + .rx_buf = &status_repeat,
> + .len = 1,
> + };
> int ret = spi_sync_transfer(spi, &xfer, 1);
> + int ret_repeat = spi_sync_transfer(spi, &xfer_repeat, 1);
> +
> + if (ret || ret_repeat)
> + return -EIO;
>
> - if ((status & MPF_STATUS_SPI_VIOLATION) ||
> - (status & MPF_STATUS_SPI_ERROR))
> + if (status != status_repeat)
> + printk("status disagreement %x %x", status, status_repeat);
> + if ((status_repeat & MPF_STATUS_SPI_VIOLATION) ||
> + (status_repeat & MPF_STATUS_SPI_ERROR))
> ret = -EIO;
>
> - return ret ? : status;
> + return ret ?: status_repeat;
> }
>
> static enum fpga_mgr_states mpf_ops_state(struct fpga_manager *mgr)

I'll check that and send out v12 if it's all rigth in the near future.