Re: [PATCH v14 4/7] PCI: endpoint: pci-ep-msi: Refactor doorbell allocation for new backends
From: Max Boone
Date: Wed Apr 29 2026 - 07:11:44 EST

> On Apr 29, 2026, at 11:33 AM, Niklas Cassel <cassel@xxxxxxxxxx> wrote:
>
> On Tue, Apr 28, 2026 at 10:36:15PM +0200, Max Boone wrote:
>>
>> I’m not very fond of keeping this implementation in the pci-ep-msi file,
>> as the platform MSI and this implementation are both iiuc specific to
>> the designware ep driver. Even more so because the MSI implementation
>> is enabled by config rather than through device tree.
>
> Why do you think that the current code with DOMAIN_BUS_PLATFORM_MSI is
> designware EPC specific?
>
> I don't see anything that is designware EPC specific.
>
> Sure, it relies on GIC ITS, but I don't see why non-designware EPCs can't
> use GIC ITS.

Good point - looking through the device trees I only saw the msi-map /
platform msi set for the imx95 and assumed designware was the only EPC
supporting this (also because the code uses that of-node specifically), but
I indeed don’t see a reason that other chips can’t use this.

I’m a bit confused about the configuration. Fwiw it’s probably me being
unfamiliar with PCIe, but it doesn’t seem right to select the MSI and eDMA
DBs through a kconfig option rather than inferring it from the device tree
and/or having the EP driver enable the capability and expose an operation
to realize it.

On the other hand, that way we would probably end up with identical DB
implementations in multiple drivers.

>> Wouldn’t we want end-users to specify what kind of doorbell they want,
>> as it seems to be that a more specific doorbell BAR layout can be
>> programmed with eDMA, allowing native support for nvmet’s doorbell
>> BAR for example.
>
> I also wanted to use my designware based EPC for doorbells in nvmet-pci-epf,
> specifically the support that Koichiro added for inbound subrange mapping.
>
> However, most designware EPC have a strict alignment requirement
> (CX_ATU_MIN_REGION_SIZE), which is often 4k.
>
> This alignment requirement is there both on the PCI address (address within
> the BAR) and for the physical memory address (target address).
>
> I thought that we could use the inbound subrange mapping and put the doorbells
> in a separate inbound iATU, so we could remove polling in the nvmet-pci-epf
> driver, just like they have done in the vNTB driver. However, it works in vNTB
> because they have a register telling exactly which BAR and offset in that BAR
> where the doorbells are.
> In the NVMe PCIe Transport specification, the offset for the start of the
> doorbells is fixed, at offset 0x1000 (4k) and the only thing you can change is
> the stride between the doorbells.
>
> Currently, a doorbell is a single 32-bit data, sure we could call
> pci_epf_alloc_doorbell() with ("number of I/O queues" + 1 (admin queue)) * 2
> (submission queue and completion queue).
>
> However, the address which we get from pci_epf_alloc_doorbell() might not be
> 4k aligned.
>
> We have the function pci_epf_align_inbound_addr() which can split this
> non-aligned address to a 4k aligned base + offset from that base.
>
> However, that would also require the host side driver to write to this offset
> from the start address. (See e.g. doorbell_offset in pci-epf-test.c).
>
> So, basically, with the current limitation that the doorbells must start at
> 0x1000, together with the fact that the doorbells returned from
> pci_epf_alloc_doorbell() might have an arbitrary alignment, I don't see how
> we could add support for doorbells in nvmet-pci-epf.
>
> If we could supply an alignment requirement to pci_epf_alloc_doorbell(),
> e.g. 4k, and the API is guaranteed to return an address that satisfies this
> alignment requirement, then we would be good.
>
> However, right now, we don't have such an API. We simply get an address
> somewhere within the GIC ITS MMIO region.

Check, thanks for the write-up, this is also what I’m looking to get working,
coincidentally on the RK3588. I had imagined that it would be possible to
build a sufficient API by passing in a base offset and stride for the
doorbell allocation, but an alignment param sounds better. Can we program
the resulting doorbells at an arbitrary offset in a BAR, or would we waste
the first allocated doorbell that’s going to be located at 0x0000 - 0x1000?

In any case, I think it would be preferable for users of the alloc_doorbell
function to pass in what kind of doorbell they want instead of using a
fallback mechanism. It seems to me that the alignment and possibly a larger
number of doorbells are possible with the eDMA doorbell mechanism. Or am I
misunderstanding eDMA here and is that bounded by the mapping / size /
alignment of the GIC ITS?
>> Originally in a patchset by Frank Li the API that was proposed was more
>> generic, and the pci-epc-msi implementation was chosen because there
>> was only one implementation:
>> - https://lore.kernel.org/imx/20231019150441.GA7254@thinkpad/
>> - https://lore.kernel.org/imx/20231019172347.GC7254@thinkpad/
>>
>> I’d personally prefer to see an abstraction that is weaved into pci-epc-core
>> and pci-epf-core that can be implemented by drivers as they wish. While
>> still keeping the enum for different types.
>>
>> That also gives room to pull a poll-mode doorbell into the pci-epc-core,
>> which deduplicates that code from the nvmet and vntb epfs, and allows
>> other functions to use RC->EP doorbells without needing to bother with
>> writing the polling mechanism.
>
> Sounds like a good idea.

I’ll refactor my local branch to include this patchset and send it in as an
RFC. I probably won’t get to it for another couple of days, though.
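
Roughly the shape I have in mind, as a userspace C sketch. Every name here
is hypothetical and nothing like it exists in pci-epc-core today. The caller
states the doorbell type, count and alignment, and the allocation fails
instead of silently falling back to another backend. The windows are
invented to exercise the logic and say nothing about real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical doorbell backends an EPF could ask for explicitly. */
enum db_type { DB_TYPE_MSI, DB_TYPE_EDMA, DB_TYPE_POLL, DB_TYPE_MAX };

/* Hypothetical request: what the EPF wants, with no implicit fallback. */
struct db_request {
	enum db_type type;
	uint32_t count;	/* number of 32-bit doorbells */
	uint64_t align;	/* alignment of the first doorbell, power of two */
};

/* Hypothetical backend description: a window it can hand doorbells from. */
struct db_backend {
	uint64_t win_start;
	uint64_t win_size;
};

/*
 * Returns 0 and stores the address of the first doorbell in *addr, or -1 if
 * the requested backend cannot satisfy the count and alignment.
 */
static int db_alloc(const struct db_backend *backends,
		    const struct db_request *req, uint64_t *addr)
{
	const struct db_backend *b;
	uint64_t start, skipped, need;

	assert(backends && req && addr);

	if ((unsigned int)req->type >= DB_TYPE_MAX || !req->count)
		return -1;
	if (!req->align || (req->align & (req->align - 1)))
		return -1;

	b = &backends[req->type];
	/* Round the window start up to the requested alignment. */
	start = (b->win_start + req->align - 1) & ~(req->align - 1);
	if (start < b->win_start)
		return -1; /* rounding wrapped around */

	skipped = start - b->win_start;
	need = (uint64_t)req->count * 4;
	if (skipped > b->win_size || need > b->win_size - skipped)
		return -1;

	*addr = start;
	return 0;
}
```

With an invented 0x100-byte window at 0x10040, a request for ten 4k-aligned
doorbells fails because rounding up leaves the window. The same request
against an invented 0x2000-byte window at 0x20000 succeeds and returns
0x20000.
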
>> P.S. I’ve been working on a vfio-user based epc for development purposes
>> personally, and the last hurdle before I want to send it in for comments is
>> support for doorbells, and came across this patchset checking if there’s
>> any other activity in the space. Having an implementation-agnostic
>> doorbell API in the EPF/EPC core would be very helpful to me.
>
> I have looked at adding doorbell support to nvmet-pci-epf, but got stuck on
> pci_epf_alloc_doorbell() returning an address that is not 4k aligned.
>
> (Since the NVMe PCIe transport specification has the doorbells at a fixed
> location, we can't change that.)
>
> But if we could provide an "alignment" parameter to pci_epf_alloc_doorbell(),
> then I think it is possible.
>
> Sure, the GIC ITS MMIO area might be quite small, so it might not be able to
> satisfy such a request. E.g. on rk3588, the its1 MMIO region is 0x20000 (128k):
> https://github.com/torvalds/linux/blob/v7.1-rc1/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi#L2414

Hrm, I think I’m misunderstanding the eDMA mechanism that is proposed in
this patch. Is the fixed eDMA register block (e.g. BAR4 for the RK3588)
translated to a space in the GIC ITS MMIO area, or is the restriction
specifically about adding alignment to the platform MSI doorbell
implementation?

> However, I have no idea how much of this region the GIC driver uses for
> actual registers, and how much of that region it can actually dedicate to
> doorbells.
>
>
> Kind regards,
> Niklas