Re: [PATCH v11 07/14] irqchip: Add RISC-V incoming MSI controller early driver

From: Björn Töpel
Date: Thu Oct 26 2023 - 04:51:13 EST


Hi Anup,

I'm getting the vibes that you are upset. Just to clarify: I want AIA
support as much as the next guy. I'm not here to pick fights or argue
about non-technical things. I'm just here for
questions/clarifications/suggestions, so we can move the implementation
forward.

If I have for some reason offended you, please let me know. It was not
on purpose, and please accept my apologies.

Now, please let's continue the technical discussion.

Anup Patel <apatel@xxxxxxxxxxxxxxxx> writes:

>> >> > +
>> >> > + writel(IMSIC_IPI_ID, local->msi_va);
>> >>
>> >> Do you need the barriers here? If so, please document. If not, use the
>> >> _relaxed() version.
>> >
>> > We can't assume that _relaxed version of MMIO operations
>> > will work for RISC-V implementation so we conservatively
>> > use regular MMIO operations without _relaxed().
>>
>> You'll need to expand on your thinking here, Anup. We can't just
>> sprinkle fences everywhere because of "we can't assume it'll work". Do
>> you need proper barriers for IPIs or not?
>
> For IPIs, we use generic IPI-mux which has its own barriers. We
> certainly need matching read and write barrier for the data being
> passed for synchronization.

Ok! If the IPI-mux already has the barriers, it seems like
writel_relaxed() will do just fine.
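
Something along these lines (only a sketch; the function/struct/field
names are guesses on my end, adjust to whatever the driver actually
uses):

#include <linux/io.h>
#include <linux/percpu.h>

static void imsic_ipi_send(unsigned int cpu)
{
	struct imsic_local_config *local = per_cpu_ptr(imsic->global.local, cpu);

	/*
	 * The ipi_mux core orders the IPI payload before invoking this
	 * send callback, so the doorbell write itself can be relaxed.
	 */
	writel_relaxed(IMSIC_IPI_ID, local->msi_va);
}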

>> >> > +void imsic_vector_mask(struct imsic_vector *vec)
>> >> > +{
>> >> > + struct imsic_local_priv *lpriv;
>> >> > + unsigned long flags;
>> >> > +
>> >> > + lpriv = per_cpu_ptr(imsic->lpriv, vec->cpu);
>> >> > + if (WARN_ON(&lpriv->vectors[vec->local_id] != vec))
>> >> > + return;
>> >> > +
>> >> > + raw_spin_lock_irqsave(&lpriv->ids_lock, flags);
>> >> > + bitmap_clear(lpriv->ids_enabled_bitmap, vec->local_id, 1);
>> >> > + raw_spin_unlock_irqrestore(&lpriv->ids_lock, flags);
>> >> > +
>> >> > + imsic_remote_sync(vec->cpu);
>> >>
>> >> x86 seems to set a timer instead for the remote CPU cleanup, which can
>> >> be much cheaper, and less intrusive. Is that applicable here?
>> >
>> > The issue with that approach is deciding the right duration
>> > of timer interrupt. There might be platforms who need
>> > immediate mask/unmask response. We can certainly
>> > keep improving/tuning this over time.
>>
>> Any concrete examples where this is an actual problem?
>
> Do you have a concrete timer duration with proper justification ?

I would simply mimic what x86 does for now -- jiffies + 1.

No biggie for me, and this can, as you say, be improved later.
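
Roughly what I have in mind (untested sketch, hypothetical names,
modeled on the cleanup timer in arch/x86/kernel/apic/vector.c;
timer_setup() at CPU bringup and proper locking omitted for brevity):

#include <linux/timer.h>
#include <linux/percpu.h>
#include <linux/jiffies.h>

static DEFINE_PER_CPU(struct timer_list, imsic_sync_timer);

/* Runs on the target CPU one tick later. */
static void imsic_sync_timer_fn(struct timer_list *t)
{
	imsic_local_sync();	/* the existing per-CPU sync routine */
}

/* Called instead of sending an immediate sync IPI to @cpu. */
static void imsic_schedule_remote_sync(unsigned int cpu)
{
	struct timer_list *t = per_cpu_ptr(&imsic_sync_timer, cpu);

	/* x86 serializes this with a per-CPU lock; skipped here. */
	if (!timer_pending(t)) {
		t->expires = jiffies + 1;
		add_timer_on(t, cpu);
	}
}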

>> >> > +void imsic_vector_move(struct imsic_vector *old_vec,
>> >> > + struct imsic_vector *new_vec)
>> >> > +{
>> >> > + struct imsic_local_priv *old_lpriv, *new_lpriv;
>> >> > + struct imsic_vector *ovec, *nvec;
>> >> > + unsigned long flags, flags1;
>> >> > + unsigned int i;
>> >> > +
>> >> > + if (WARN_ON(old_vec->cpu == new_vec->cpu ||
>> >> > + old_vec->order != new_vec->order ||
>> >> > + (old_vec->local_id & IMSIC_VECTOR_MASK(old_vec)) ||
>> >> > + (new_vec->local_id & IMSIC_VECTOR_MASK(new_vec))))
>> >> > + return;
>> >> > +
>> >> > + old_lpriv = per_cpu_ptr(imsic->lpriv, old_vec->cpu);
>> >> > + if (WARN_ON(&old_lpriv->vectors[old_vec->local_id] != old_vec))
>> >> > + return;
>> >> > +
>> >> > + new_lpriv = per_cpu_ptr(imsic->lpriv, new_vec->cpu);
>> >> > + if (WARN_ON(&new_lpriv->vectors[new_vec->local_id] != new_vec))
>> >> > + return;
>> >> > +
>> >> > + raw_spin_lock_irqsave(&old_lpriv->ids_lock, flags);
>> >> > + raw_spin_lock_irqsave(&new_lpriv->ids_lock, flags1);
>> >> > +
>> >> > + /* Move the state of each vector entry */
>> >> > + for (i = 0; i < BIT(old_vec->order); i++) {
>> >> > + ovec = old_vec + i;
>> >> > + nvec = new_vec + i;
>> >> > +
>> >> > + /* Unmask the new vector entry */
>> >> > + if (test_bit(ovec->local_id, old_lpriv->ids_enabled_bitmap))
>> >> > + bitmap_set(new_lpriv->ids_enabled_bitmap,
>> >> > + nvec->local_id, 1);
>> >> > +
>> >> > + /* Mask the old vector entry */
>> >> > + bitmap_clear(old_lpriv->ids_enabled_bitmap, ovec->local_id, 1);
>> >> > +
>> >> > + /*
>> >> > + * Move and re-trigger the new vector entry based on the
>> >> > + * pending state of the old vector entry because we might
>> >> > + * get a device interrupt on the old vector entry while
>> >> > + * device was being moved to the new vector entry.
>> >> > + */
>> >> > + old_lpriv->ids_move[ovec->local_id] = nvec;
>> >> > + }
>> >>
>> >> Hmm, nested spinlocks, and reimplementing what the irq matrix allocator
>> >> does.
>> >>
>> >> Convince me why irq matrix is not a good fit to track the interrupt IDs
>> >> *and* get handling/tracking for managed/unmanaged interrupts. You said
>> >> that it was the power-of-two blocks for MSI, but can't that be enforced
>> >> on matrix alloc? Where are you doing the special handling of MSI?
>> >>
>> >> The reason I'm asking is because I'm pretty certain that x86 has proper
>> >> MSI support (Thomas Gleixner can answer for sure! ;-))
>> >>
>> >> IMSIC smells a lot like the LAPIC. The implementation could probably
>> >> be *very* close to what arch/x86/kernel/apic/vector.c does.
>> >>
>> >> Am I completely off here?
>> >>
>> >
>> > The x86 APIC driver only supports MSI-X due to which the IRQ matrix
>> > allocator only supports ID/Vector allocation suitable for MSI-X whereas
>> > the ARM GICv3 driver supports both legacy MSI and MSI-X. In absence
>> > of legacy MSI support, Linux x86 will fallback to INTx for PCI devices
>> > with legacy MSI support but for RISC-V platforms we can't assume that
>> > INTx is available because we might be dealing with an IMSIC-only
>> > platform.
>>
>> You're mixing up MSI and *multi-MSI* (multiple MSI vectors).
>
> So now you are doubting my understanding of MSI ?

I'm not doubting anything. Maybe we need to clarify so that we
understand each other.

You said: "The x86 APIC driver only supports MSI-X..." And that made me
think that you didn't have all the details. Sorry for making that
assumption.

Let's clear up the terminology, for our own sake:

* legacy-MSI: MSI (non-MSIX!), with *only one vector*.
* multi-MSI: MSI (non-MSIX!), with multiple vectors
* MSI-X

"MSI" can also refer to all of the above.

x86 supports legacy-MSI and MSI-X for non-remapped MSIs, and multi-MSI
with IOMMU support.
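
For reference, this is roughly how the three flavors look from a device
driver's point of view (illustrative snippet only, not related to this
patch):

#include <linux/pci.h>

static int example_probe(struct pci_dev *pdev)
{
	int nvec;

	/*
	 * multi-MSI: up to 32 vectors sharing one address/data pair, so
	 * the MSI controller must hand out an aligned, contiguous,
	 * power-of-two block of IDs.
	 */
	nvec = pci_alloc_irq_vectors(pdev, 1, 8, PCI_IRQ_MSI);

	/* legacy-MSI is simply the nvec == 1 case of the above. */

	/* MSI-X: every vector has its own address/data table entry. */
	if (nvec < 0)
		nvec = pci_alloc_irq_vectors(pdev, 1, 8, PCI_IRQ_MSIX);

	return nvec < 0 ? nvec : 0;
}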

>> x86 support MSI-X, MSI, and multi-MSI with IOMMU.
>>
>> Gleixner has a some insights on why one probably should *not* jump
>> through hoops to support multi-MSI:
>> https://lore.kernel.org/all/877d7yhve7.ffs@tglx/
>
> This is a fair justification to drop why x86 does not support
> the legacy-MSI or "multi-MSI".

My claim is that x86 does support legacy-MSI, but as a design decision
has avoided multi-MSI.

AFAIU, there are few multi-MSI devices out there.

>> Will we really see HW requiring multi-MSI support on RISC-V systems
>> without IOMMU? To me this sounds like a theoretical exercise.
>>
>> > Refer, x86_vector_msi_parent_ops in arch/x86/kernel/apic/msi.c and
>> > X86_VECTOR_MSI_FLAGS_SUPPORTED in arch/x86/include/asm/msi.h
>> >
>> > Refer, its_pci_msi_domain_info in drivers/irqchip/irq-gic-v3-its-pci-msi.c
>> >
>> > The changes which I think are need in the IRQ matrix allocator before
>> > integrating it in the IMSIC driver are the following:
>> > 1) IRQ matrix allocator assumed NR_VECTORS to be a fixed define
>> > which the arch code provides but in RISC-V world the number of
>> > IDs are discovered from DT or ACPI.
>>
>> Ok, let's try to be a bit more explicit. Have you had a look at
>> kernel/irq/matrix.c?
>
> Why do you doubt it ?

Again, no doubts -- I'm just trying to clarify. Sorry if that touched a
nerve!

>> You need to define the IRQ_MATRIX_BITS (which x86 sets to NR_VECTORS).
>> This is the size of the bitmap. For IMSIC this would be 2047.
>
> Wow, let's just create large bitmaps even when underlying HW has
> fewer per-CPU IDs !!!

Yeah, fair argument. It's a bit too much. Here's a patch to the matrix
allocator that fixes that; note that it's only compile-tested.
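
To sketch where this is heading (hypothetical names; nr_ids is whatever
the driver discovers from DT/ACPI), the IMSIC side could then size the
matrix at runtime:

#include <linux/irq.h>

static int __init imsic_irq_matrix_init(unsigned int nr_ids)
{
	/* IMSIC ID 0 is reserved, usable IDs are 1..nr_ids. */
	imsic->matrix = irq_alloc_matrix(nr_ids + 1, 1, nr_ids + 1);
	if (!imsic->matrix)
		return -ENOMEM;

	return 0;
}

The allocator change itself follows: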

--8<--