Re: [PATCH 08/18] soc: qcom: ipa: the generic software interface
From: Alex Elder
Date: Wed May 15 2019 - 08:15:04 EST
On 5/15/19 2:21 AM, Arnd Bergmann wrote:
> On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@xxxxxxxxxx> wrote:
>
>> +/** gsi_gpi_channel_scratch - GPI protocol scratch register
>> + *
>> + * @max_outstanding_tre:
>> + * Defines the maximum number of TREs allowed in a single transaction
>> + * on a channel (in Bytes). This determines the amount of prefetch
>> + * performed by the hardware. We configure this to equal the size of
>> + * the TLV FIFO for the channel.
>> + * @outstanding_threshold:
>> + * Defines the threshold (in Bytes) determining when the sequencer
>> + * should update the channel doorbell. We configure this to equal
>> + * the size of two TREs.
>> + */
>> +struct gsi_gpi_channel_scratch {
>> + u64 rsvd1;
>> + u16 rsvd2;
>> + u16 max_outstanding_tre;
>> + u16 rsvd3;
>> + u16 outstanding_threshold;
>> +} __packed;
>> +
>> +/** gsi_channel_scratch - channel scratch configuration area
>> + *
>> + * The exact interpretation of this register is protocol-specific.
>> + * We only use GPI channels; see struct gsi_gpi_channel_scratch, above.
>> + */
>> +union gsi_channel_scratch {
>> + struct gsi_gpi_channel_scratch gpi;
>> + struct {
>> + u32 word1;
>> + u32 word2;
>> + u32 word3;
>> + u32 word4;
>> + } data;
>> +} __packed;
>
> What are the exact alignment requirements on these structures,
> do you ever need to have them on odd addresses? If not, please
> remove the __packed, or add __aligned() with the actual alignment,
> e.g. __aligned(4), to let the compiler create better code and
> avoid bytewise accesses.
Honestly I don't know, but I would guess they actually
have alignment requirements consistent with the C standard...
Many, many structures had the __packed attribute attached
in the original code. I removed most but apparently not
all. I will remove the __packed here, and will scan through
the rest of the code for other similar instances and will
remove those if appropriate as well.
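
To illustrate the point about __packed, here is a minimal user-space sketch, not taken from the patch. It assumes GCC or Clang attribute syntax, and uses uint64_t/uint16_t as stand-ins for the kernel's u64/u16; the struct names are hypothetical. The field layout mirrors gsi_gpi_channel_scratch. Because every member already sits on its natural boundary, packing does not change the size; it only lowers the assumed alignment to 1, which is what can force bytewise accesses on some architectures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space copies of the scratch layout (names are
 * illustrative only; fields follow gsi_gpi_channel_scratch).
 */
struct scratch_packed {
	uint64_t rsvd1;
	uint16_t rsvd2;
	uint16_t max_outstanding_tre;
	uint16_t rsvd3;
	uint16_t outstanding_threshold;
} __attribute__((packed));

struct scratch_natural {
	uint64_t rsvd1;
	uint16_t rsvd2;
	uint16_t max_outstanding_tre;
	uint16_t rsvd3;
	uint16_t outstanding_threshold;
};

struct scratch_packed_aligned4 {
	uint64_t rsvd1;
	uint16_t rsvd2;
	uint16_t max_outstanding_tre;
	uint16_t rsvd3;
	uint16_t outstanding_threshold;
} __attribute__((packed, aligned(4)));

/* Packing removes no padding here: both layouts are 16 bytes. */
static_assert(sizeof(struct scratch_packed) == 16, "packed size");
static_assert(sizeof(struct scratch_natural) == 16, "natural size");
static_assert(offsetof(struct scratch_natural, outstanding_threshold) == 14,
	      "no internal padding without packed");

/* What packing does change is the alignment the compiler may assume. */
static_assert(_Alignof(struct scratch_packed) == 1, "packed alignment");
static_assert(_Alignof(struct scratch_packed_aligned4) == 4,
	      "packed plus aligned(4)");
```

If this compiles, the static assertions hold for the toolchain in use. The aligned(4) variant corresponds to the alternative suggested in the review: the layout stays fixed, but the compiler is told it may use word-sized accesses.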
>> +/* Init function for GSI. GSI hardware does not need to be "ready" */
>> +int gsi_init(struct gsi *gsi, struct platform_device *pdev, u32 data_count,
>> + const struct gsi_ipa_endpoint_data *data)
>> +{
>> + struct resource *res;
>> + resource_size_t size;
>> + unsigned int irq;
>> + int ret;
>> +
>> + gsi->dev = &pdev->dev;
>> + init_dummy_netdev(&gsi->dummy_dev);
>
> Can you add a comment here to explain what the 'dummy' device is
> needed for?
Yes, good idea.
FYI it's needed because the GSI code is not a "real"
network device (that, where needed, is implemented in
"ipa_netdev.c", two logical layers up), but in order
to use NAPI there needs to be one.
>> + /* Get GSI memory range and map it */
>> + res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "gsi");
>> + if (!res)
>> + return -ENXIO;
>> +
>> + size = resource_size(res);
>> + if (res->start > U32_MAX || size > U32_MAX - res->start)
>> + return -EINVAL;
>> +
>> + gsi->virt = ioremap_nocache(res->start, size);
>> + if (!gsi->virt)
>> + return -ENOMEM;
>
> The _nocache() postfix is not needed here, and I find it a bit
> confusing, just use plain ioremap, or maybe even
> devm_platform_ioremap_resource() to save the
> platform_get_resource_byname().
OK good idea. This was in the original code and I neglected
to chase this down. Thank you for catching it.
>> + ret = request_irq(irq, gsi_isr, 0, "gsi", gsi);
>> + if (ret)
>> + goto err_unmap_virt;
>> + gsi->irq = irq;
>> +
>> + ret = enable_irq_wake(gsi->irq);
>> + if (ret)
>> + dev_err(gsi->dev, "error %d enabling gsi wake irq\n", ret);
>> + gsi->irq_wake_enabled = ret ? 0 : 1;
>> +
>> + spin_lock_init(&gsi->spinlock);
>> + mutex_init(&gsi->mutex);
>
> This looks a bit dangerous if you can ever get to the point of
> having a pending interrupt. before the structure is fully initialized.
> This can probably not happen in practice, but it's better to request
> the interrupts last to be on the safe side.
Understood. I'll fix that.
>> +/* Wait for all transaction activity on a channel to complete */
>> +void gsi_channel_trans_quiesce(struct gsi *gsi, u32 channel_id)
>> +{
>> + struct gsi_channel *channel = &gsi->channel[channel_id];
>> + struct gsi_trans_info *trans_info;
>> + struct gsi_trans *trans = NULL;
>> + struct gsi_evt_ring *evt_ring;
>> + struct list_head *list;
>> + unsigned long flags;
>> +
>> + trans_info = &channel->trans_info;
>> + evt_ring = &channel->gsi->evt_ring[channel->evt_ring_id];
>> +
>> + spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
>> +
>> + /* Find the last list to which a transaction was added */
>> + if (!list_empty(&trans_info->alloc))
>> + list = &trans_info->alloc;
>> + else if (!list_empty(&trans_info->pending))
>> + list = &trans_info->pending;
>> + else if (!list_empty(&trans_info->complete))
>> + list = &trans_info->complete;
>> + else if (!list_empty(&trans_info->polled))
>> + list = &trans_info->polled;
>> + else
>> + list = NULL;
>> +
>> + if (list) {
>> + struct gsi_trans *trans;
>> +
>> + /* The last entry on this list is the last one allocated.
>> + * Grab a reference so we can wait for it.
>> + */
>> + trans = list_last_entry(list, struct gsi_trans, links);
>> + refcount_inc(&trans->refcount);
>> + }
>> +
>> + spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
>> +
>> + /* If there is one, wait for it to complete */
>> + if (trans) {
>> + wait_for_completion(&trans->completion);
>
> Since you are waiting here, you clearly can't be called
> from interrupt context, or with interrupts disabled, so it's
> clearer to use spin_lock_irq() instead of spin_lock_irqsave().
>
> I generally try to avoid the _irqsave versions altogether, unless
> it is really needed for a function that is called both from
> irq-disabled and irq-enabled context.
OK. And I appreciate what you're saying here because I do prefer
code that communicates more about the context in ways like
you describe.
Thank you.
-Alex
>
> Arnd
>