RE: [PATCH v10] platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc
From: Liming Sun
Date: Fri Mar 08 2019 - 09:44:43 EST
Andy,
The v11 has been posted.
Thanks!
Liming
> -----Original Message-----
> From: Liming Sun
> Sent: Wednesday, March 6, 2019 3:01 PM
> To: 'Andy Shevchenko' <andy.shevchenko@xxxxxxxxx>
> Cc: David Woods <dwoods@xxxxxxxxxxxx>; Andy Shevchenko <andy@xxxxxxxxxxxxx>; Darren Hart <dvhart@xxxxxxxxxxxxx>; Vadim
> Pasternak <vadimp@xxxxxxxxxxxx>; Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>; Platform Driver <platform-driver-
> x86@xxxxxxxxxxxxxxx>
> Subject: RE: [PATCH v10] platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc
>
> Thanks Andy! Please see my response below. If no further comments, I'll try to post v11 after more testing.
>
> Regards,
> Liming
>
> > -----Original Message-----
> > From: Andy Shevchenko <andy.shevchenko@xxxxxxxxx>
> > Sent: Tuesday, March 5, 2019 10:34 AM
> > To: Liming Sun <lsun@xxxxxxxxxxxx>
> > Cc: David Woods <dwoods@xxxxxxxxxxxx>; Andy Shevchenko <andy@xxxxxxxxxxxxx>; Darren Hart <dvhart@xxxxxxxxxxxxx>; Vadim
> > Pasternak <vadimp@xxxxxxxxxxxx>; Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>; Platform Driver <platform-driver-
> > x86@xxxxxxxxxxxxxxx>
> > Subject: Re: [PATCH v10] platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc
> >
> > On Thu, Feb 28, 2019 at 5:51 PM Liming Sun <lsun@xxxxxxxxxxxx> wrote:
> > >
> > > This commit adds the TmFifo platform driver for Mellanox BlueField
> > > Soc. TmFifo is a shared FIFO which enables external host machine
> > > to exchange data with the SoC via USB or PCIe. The driver is based
> > > on virtio framework and has console and network access enabled.
> >
> > Thank you for an update.
> >
> > Unfortunately more work is needed. My comments below.
> >
> > > +#define MLXBF_TMFIFO_TX_STS__COUNT_RMASK GENMASK(8, 0)
> > > +#define MLXBF_TMFIFO_TX_STS__COUNT_MASK GENMASK(8, 0)
> >
> > > +#define MLXBF_TMFIFO_TX_CTL__LWM_RMASK GENMASK(7, 0)
> > > +#define MLXBF_TMFIFO_TX_CTL__LWM_MASK GENMASK(7, 0)
> >
> > > +#define MLXBF_TMFIFO_TX_CTL__HWM_RMASK GENMASK(7, 0)
> > > +#define MLXBF_TMFIFO_TX_CTL__HWM_MASK GENMASK(15, 8)
> >
> > > +#define MLXBF_TMFIFO_TX_CTL__MAX_ENTRIES_RMASK GENMASK(8, 0)
> > > +#define MLXBF_TMFIFO_TX_CTL__MAX_ENTRIES_MASK GENMASK_ULL(40, 32)
> >
> > > +#define MLXBF_TMFIFO_RX_STS__COUNT_RMASK GENMASK(8, 0)
> > > +#define MLXBF_TMFIFO_RX_STS__COUNT_MASK GENMASK(8, 0)
> >
> > > +#define MLXBF_TMFIFO_RX_CTL__LWM_RMASK GENMASK(7, 0)
> > > +#define MLXBF_TMFIFO_RX_CTL__LWM_MASK GENMASK(7, 0)
> >
> > > +#define MLXBF_TMFIFO_RX_CTL__HWM_RMASK GENMASK(7, 0)
> > > +#define MLXBF_TMFIFO_RX_CTL__HWM_MASK GENMASK(15, 8)
> >
> > > +#define MLXBF_TMFIFO_RX_CTL__MAX_ENTRIES_RMASK GENMASK(8, 0)
> > > +#define MLXBF_TMFIFO_RX_CTL__MAX_ENTRIES_MASK GENMASK_ULL(40, 32)
> >
> > Since two of them have _ULL suffix I'm wondering if you have checked
> > for side effects on the rest, i.e. if you operate with 64-bit variable
> > and use something like ~MLXBF_TMFIFO_RX_CTL__MAX_ENTRIES_RMASK, it may
> > give you interesting results.
>
> The running system on the SoC is arm64 where BITS_PER_LONG and
> BITS_PER_LONG_LONG have the same value. In such case, the two macros appears
> to be the same. But you're right, I should use GENMASK_ULL() to be consistent
> and more correctly, just in case the "CONFIG_64BIT" is not defined somehow.
>
> Will update it in v11.
>
> >
> > > +#define MLXBF_TMFIFO_TIMER_INTERVAL (HZ / 10)
> >
> > > +/**
> > > + * mlxbf_tmfifo_u64 - Union of 64-bit data
> > > + * @data - 64-bit data in host byte order
> > > + * @data_le - 64-bit data in little-endian byte order
> > > + *
> > > + * It's expected to send 64-bit little-endian value (__le64) into the TmFifo.
> > > + * readq() and writeq() expect u64 instead. A union structure is used here
> > > + * to workaround the explicit casting usage like writeq(*(u64 *)&data_le).
> > > + */
> >
> > How do you know what readq()/writeq() does with the data? Is it on all
> > architectures?
> > How the endianess conversion affects the actual data?
>
> The SoC runs arm64 and supports little endian for now. The FIFO has two sides,
> one side is the SoC, the other side is extern host machine which could
> access the FIFO via USB or PCIe. The rule is that the 'byte stream' will
> keep the same when one side write 8 bytes and the other side reads
> the 8 bytes. So as long as both sides have agreement on the byte-order
> it should be fine.
>
> After double check the arm64 readq()/writeq() implementation, it appears
> that these APIs already does cpu_to_le64() and le64_to_cpu()
> conversion. There's actually no need to make another conversion
> (and shouldn't do it). I'll remove these conversions in v11. The code will
> look much cleaner.
>
> >
> > > +union mlxbf_tmfifo_u64 {
> > > + u64 data;
> > > + __le64 data_le;
> > > +};
> >
> > > +/*
> > > + * Default MAC.
> > > + * This MAC address will be read from EFI persistent variable if configured.
> > > + * It can also be reconfigured with standard Linux tools.
> > > + */
> > > +static u8 mlxbf_tmfifo_net_default_mac[6] = {
> > > + 0x00, 0x1A, 0xCA, 0xFF, 0xFF, 0x01};
> >
> > > +#define mlxbf_vdev_to_tmfifo(dev) \
> > > + container_of(dev, struct mlxbf_tmfifo_vdev, vdev)
> >
> > One line?
>
> Couldn't fit it into one line within 80 characters.
> (Please correct me if you meant single line even exceeding 80 chracters).
>
> >
> > > +/* Return the consumed Tx buffer space. */
> > > +static int mlxbf_tmfifo_vdev_tx_buf_len(struct mlxbf_tmfifo_vdev *tm_vdev)
> > > +{
> > > + int len;
> > > +
> > > + if (tm_vdev->tx_tail >= tm_vdev->tx_head)
> > > + len = tm_vdev->tx_tail - tm_vdev->tx_head;
> > > + else
> > > + len = MLXBF_TMFIFO_CONS_TX_BUF_SIZE - tm_vdev->tx_head +
> > > + tm_vdev->tx_tail;
> > > + return len;
> > > +}
> >
> > Is this custom implementation of some kind of circ_buf?
>
> Yes. I'll try if I could re-use the circ_buf structure and update it in v11.
>
> >
> > > +/* Allocate vrings for the fifo. */
> > > +static int mlxbf_tmfifo_alloc_vrings(struct mlxbf_tmfifo *fifo,
> > > + struct mlxbf_tmfifo_vdev *tm_vdev)
> > > +{
> > > + struct mlxbf_tmfifo_vring *vring;
> > > + struct device *dev;
> > > + dma_addr_t dma;
> > > + int i, size;
> > > + void *va;
> > > +
> > > + for (i = 0; i < ARRAY_SIZE(tm_vdev->vrings); i++) {
> > > + vring = &tm_vdev->vrings[i];
> > > + vring->fifo = fifo;
> > > + vring->num = MLXBF_TMFIFO_VRING_SIZE;
> > > + vring->align = SMP_CACHE_BYTES;
> > > + vring->index = i;
> > > + vring->vdev_id = tm_vdev->vdev.id.device;
> > > + dev = &tm_vdev->vdev.dev;
> > > +
> > > + size = vring_size(vring->num, vring->align);
> > > + va = dma_alloc_coherent(dev->parent, size, &dma, GFP_KERNEL);
> > > + if (!va) {
> >
> > > + dev_err(dev->parent, "dma_alloc_coherent failed\n");
> >
> > And how do you clean previously allocated items?
>
> Fixed. Check the return value of mlxbf_tmfifo_alloc_vrings() and goto
> 'register_fail' (probably change to a better name) instead of 'fail'.
> In such case the mlxbf_tmfifo_free_vrings() will be called to clean up
> all allocated vrings.
>
> >
> > > + return -ENOMEM;
> > > + }
> > > +
> > > + vring->va = va;
> > > + vring->dma = dma;
> > > + }
> > > +
> > > + return 0;
> > > +}
> >
> > > +/* Disable interrupts of the fifo device. */
> > > +static void mlxbf_tmfifo_disable_irqs(struct mlxbf_tmfifo *fifo)
> > > +{
> > > + int i, irq;
> > > +
> > > + for (i = 0; i < MLXBF_TM_MAX_IRQ; i++) {
> > > + irq = fifo->irq_info[i].irq;
> >
> > > + if (irq) {
> >
> > I don't think this check is needed if you can guarantee that it has no
> > staled records.
>
> Yes, it's not needed any more according to the current code.
> Will remove it in v11.
>
> >
> > > + fifo->irq_info[i].irq = 0;
> > > + disable_irq(irq);
> > > + }
> > > + }
> > > +}
> >
> > > +/* Get the number of available words in the TmFifo for sending. */
> > > +static int mlxbf_tmfifo_get_tx_avail(struct mlxbf_tmfifo *fifo, int vdev_id)
> > > +{
> > > + int tx_reserve;
> > > + u64 sts;
> > > +
> > > + /* Reserve some room in FIFO for console messages. */
> > > + if (vdev_id == VIRTIO_ID_NET)
> > > + tx_reserve = fifo->tx_fifo_size / MLXBF_TMFIFO_RESERVE_RATIO;
> > > + else
> > > + tx_reserve = 1;
> > > +
> > > + sts = readq(fifo->tx_base + MLXBF_TMFIFO_TX_STS);
> >
> > > + return (fifo->tx_fifo_size - tx_reserve -
> > > + FIELD_GET(MLXBF_TMFIFO_TX_STS__COUNT_MASK, sts));
> >
> > Redundant parens.
> > Moreover, consider
> >
> > u32 count; // or whatever suits for FIELD_GET().
> > ...
> >
> > sts = readq(...);
> > count = FIELD_GET(...);
> > return ...;
>
> Will update in v11.
>
> >
> > > +}
> >
> > > + while (size > 0) {
> > > + addr = cons->tx_buf + cons->tx_head;
> > > +
> > > + if (cons->tx_head + sizeof(u64) <=
> > > + MLXBF_TMFIFO_CONS_TX_BUF_SIZE) {
> > > + memcpy(&data, addr, sizeof(u64));
> > > + } else {
> > > + partial = MLXBF_TMFIFO_CONS_TX_BUF_SIZE - cons->tx_head;
> > > + memcpy(&data, addr, partial);
> >
> > > + memcpy((u8 *)&data + partial, cons->tx_buf,
> > > + sizeof(u64) - partial);
> >
> > Unaligned access?!
>
> The code here is to build and copy 8 bytes from the buffer into the 'data'
> variable. The source could be unaligned. For example, 3 bytes are at the
> end of the buffer and 5 bytes are at the beginning of the buffer. memcpy()
> is used to do byte-stream copy which seems ok. Please correct me if
> I misunderstand the comment.
>
> >
> > > + }
> >
> > > + buf.data = readq(fifo->rx_base + MLXBF_TMFIFO_RX_DATA);
> > > + buf.data = le64_to_cpu(buf.data_le);
> >
> > Are you sure this is correct?
> > How did you test this on BE architectures?
>
> Thanks for the comment! Same as above, the conversion is not really needed.
> I'll remove them in v11. As for testing, we only have arm64 little-endian Linux
> running on the SoC. This conversion doesn't make much difference for the SoC.
> As for BE architecture, we mainly verify the other side of the FIFO, which is the
> external host like using ppc64.
>
> >
> > > + tm_vdev = devm_kzalloc(dev, sizeof(*tm_vdev), GFP_KERNEL);
> >
> > Is it appropriate use of devm_* ?
>
> This is SoC, the device won't be closed or detached. The only case is when
> the driver is unloaded. So it appears ok to use devm_kzalloc() since it's
> allocated during probe() and released during module unload . Please
> correct me if I misunderstand it.
>
> >
> > > + if (!tm_vdev) {
> > > + ret = -ENOMEM;
> > > + goto fail;
> > > + }
> >
> > > +/* Read the configured network MAC address from efi variable. */
> > > +static void mlxbf_tmfifo_get_cfg_mac(u8 *mac)
> > > +{
> > > + efi_guid_t guid = EFI_GLOBAL_VARIABLE_GUID;
> > > + efi_status_t status;
> > > + unsigned long size;
> >
> > > + u8 buf[6];
> >
> > ETH_ALEN ?
>
> Will update it in v11
>
> >
> > > +
> > > + size = sizeof(buf);
> >
> > Ditto.
>
> Will update it in v11
>
> >
> > > + status = efi.get_variable(mlxbf_tmfifo_efi_name, &guid, NULL, &size,
> > > + buf);
> >
> > > + if (status == EFI_SUCCESS && size == sizeof(buf))
> >
> > Ditto.
>
> Will update it in v11
>
> >
> > > + memcpy(mac, buf, sizeof(buf));
> >
> > ether_addr_copy().
>
> Will update it in v11
>
> >
> > > +}
> >
> > > + memcpy(net_config.mac, mlxbf_tmfifo_net_default_mac, 6);
> >
> > ether_addr_copy()...
> >
> > > + mlxbf_tmfifo_get_cfg_mac(net_config.mac);
> >
> > ... but actually above should be part of this function.
>
> Will update it in v11
>
> >
> > --
> > With Best Regards,
> > Andy Shevchenko