Re: [PATCH v2 00/10] PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset

From: Koichiro Den

Date: Fri Feb 27 2026 - 04:01:19 EST


On Fri, Feb 27, 2026 at 05:49:45PM +0900, Koichiro Den wrote:
> This series fixes doorbell bit/vector handling for the EPF-based NTB
> pair (ntb_hw_epf <-> pci-epf-*ntb). Its primary goal is to enable safe
> per-db-vector handling in the NTB core and clients (e.g. ntb_transport),
> without changing the on-the-wire doorbell mapping.
>
>
> Background / problem
> ====================
>
> ntb_hw_epf historically applies an extra offset when ringing peer
> doorbells: the link event uses the first interrupt slot, and doorbells
> start from the third slot (i.e. a second slot is effectively unused).
> pci-epf-vntb carries the matching offset on the EP side as well.
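
For readers following along, the slot layout described above can be sketched in plain userspace C. This is only an illustration, not code from the patches: the enum and function names are hypothetical, and it assumes slot numbering starts at 0 with the link event in slot 0, one unused slot, and doorbell 0 in slot 2.

```c
#include <assert.h>

/* Hypothetical names; the layout follows the description in the cover letter. */
enum epf_irq_slot {
	EPF_SLOT_LINK_EVENT = 0, /* first slot: link event */
	EPF_SLOT_UNUSED     = 1, /* second slot: effectively unused */
	EPF_SLOT_DB_BASE    = 2, /* third slot: doorbell 0 */
};

/* 0-based doorbell bit -> interrupt slot rung on the peer (legacy offset kept). */
static unsigned int db_bit_to_slot(unsigned int db_bit)
{
	return db_bit + EPF_SLOT_DB_BASE;
}

/* Interrupt slot -> 0-based db_vector, or -1 if the slot is not a doorbell. */
static int slot_to_db_vector(unsigned int slot, unsigned int db_count)
{
	assert(db_count <= 64);
	if (slot < EPF_SLOT_DB_BASE || slot - EPF_SLOT_DB_BASE >= db_count)
		return -1;
	return (int)(slot - EPF_SLOT_DB_BASE);
}
```

With four doorbells, slot 2 maps to db_vector 0 and slot 5 to db_vector 3, while slots 0 and 1 are rejected.
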
>
> As long as db_vector_count()/db_vector_mask() are not implemented, this
> mismatch is mostly masked. Doorbell events are effectively treated as
> "can hit any QP" and the off-by-one vector numbering does not surface
> clearly.
>
> However, once per-vector handling is enabled, the current state becomes
> problematic:
>
> - db_valid_mask exposes bits that do not correspond to real doorbells
>   (link/unused slots leak into the mask).
> - ntb_db_event() is fed with 1-based/shifted vectors, while NTB core
>   expects a 0-based db_vector for doorbells.
> - On pci-epf-vntb, .peer_db_set() may be called in atomic context, but
>   it directly calls pci_epc_raise_irq(), which can sleep.
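
The first point in the list above can be illustrated with a small userspace sketch. It assumes (this is not confirmed by the cover letter) that the leaky mask is derived from a raw slot count that includes the two reserved slots; RESERVED_SLOTS and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RESERVED_SLOTS 2u /* link event + unused slot (hypothetical name) */

/* Mask built from the raw slot count: reserved slots leak in as extra bits. */
static uint64_t mask_from_slots(unsigned int slot_count)
{
	assert(slot_count < 64);
	return (UINT64_C(1) << slot_count) - 1;
}

/* Mask covering only real doorbells: reserved slots are subtracted first. */
static uint64_t db_valid_mask(unsigned int slot_count)
{
	assert(slot_count >= RESERVED_SLOTS && slot_count < 64);
	return (UINT64_C(1) << (slot_count - RESERVED_SLOTS)) - 1;
}
```

For six slots, the first helper yields 0x3f and the second 0xf, so two bits that can never correspond to a doorbell are no longer advertised.
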
>
>
> Why NOT fix the root offset?
> ============================
>
> The natural "root" fix would be to remove the historical extra offset in
> the peer_db_set() doorbell paths for ntb_hw_epf and pci-epf-vntb.
> Unfortunately this would lead to interoperability issues when mixing old
> and new kernel versions (old/new peers). A new side would ring a
> different interrupt slot than what an old peer expects, leading to
> missed or misrouted doorbells, once db_vector_count()/db_vector_mask()
> are implemented.
>
> Therefore this series intentionally keeps the legacy offset, and instead
> fixes the surrounding pieces so the mapping is documented and handled
> consistently in masks, vector numbering, and per-vector reporting.
>
>
> What this series does
> =====================
>
> - pci-epf-vntb:
>
>   - Document the legacy offset.
>   - Defer MSI doorbell raises to process context to avoid sleeping in
>     atomic context. This becomes relevant once multiple doorbells are
>     raised concurrently at a high rate.
>   - Report doorbell vectors as 0-based to ntb_db_event().
>   - Fix db_valid_mask and implement db_vector_count()/db_vector_mask().
>
> - ntb_hw_epf:
>
>   - Document the legacy offset in ntb_epf_peer_db_set().
>   - Fix db_valid_mask to cover only real doorbell bits.
>   - Report 0-based db_vector to ntb_db_event() (accounting for the
>     unused slot).
>   - Keep db_val as a bitmask and fix db_read/db_clear semantics
>     accordingly.
>   - Implement db_vector_count()/db_vector_mask().
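
The per-vector callbacks and bitmask semantics listed above can be sketched roughly as follows. This is a userspace illustration only, assuming one doorbell bit per vector; struct db_state and the function bodies are hypothetical and do not reproduce the driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state: number of real doorbells and the pending-bit mask. */
struct db_state {
	unsigned int db_count;
	uint64_t db_val;
};

static int db_vector_count(const struct db_state *s)
{
	return (int)s->db_count;
}

/* One bit per vector; out-of-range vectors map to an empty mask. */
static uint64_t db_vector_mask(const struct db_state *s, int vector)
{
	assert(s->db_count <= 64);
	if (vector < 0 || (unsigned int)vector >= s->db_count)
		return 0;
	return UINT64_C(1) << vector;
}

/* Record a doorbell event for a 0-based vector. */
static void db_event(struct db_state *s, int vector)
{
	s->db_val |= db_vector_mask(s, vector);
}

/* db_read reports all pending bits; db_clear drops only the requested ones. */
static uint64_t db_read(const struct db_state *s)
{
	return s->db_val;
}

static void db_clear(struct db_state *s, uint64_t bits)
{
	s->db_val &= ~bits;
}
```

Keeping db_val as a bitmask means two doorbells that fire back to back both stay pending until each is cleared individually.
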
>
>
> Compatibility
> =============
>
> By keeping the legacy offset intact, this series aims to remain
> compatible across mixed kernel versions. The observable changes are
> limited to correct mask/vector reporting and safer execution context
> handling.
>
> Patches 1-5 (PCI Endpoint) and 6-10 (NTB) are independent and can be
> applied separately for each tree. I am sending them together in this
> series to provide the full context and to make the cross-subsystem
> compatibility constraints explicit. Ideally the whole series would be
> applied in a single tree, but each subset is safe to merge on its own.
>
> - Patches 1-5 apply cleanly onto the latest pci/endpoint:
>   f6797680fe31 ("PCI: epf-mhi: Return 0 on success instead of positive
>   jiffies from pci_epf_mhi_edma_{read/write}")
>
> - Patches 6-10 apply cleanly onto the latest ntb-next:
>   7b3302c687ca ("ntb_hw_amd: Fix incorrect debug message in link disable
>   path")
>
> Note: I don't have suitable hardware to test the ntb_hw_epf +
> pci-epf-ntb (not vNTB) bridge scenario, but I believe no changes are
> needed in pci-epf-ntb.c.
>
>
> Changelog
> =========
>
> Changes since v1:
> - Addressed feedback from Dave (add a source code comment, introduce
>   an enum to eliminate magic numbers)
> - Updated source code comment in Patch 2.
> - No functional changes, so retained Reviewed-by tags from Frank and
>   Dave. Thank you both for the review.

Sorry, I accidentally used an incorrect series title.
The correct subject should be:

[PATCH v2 00/10] NTB: epf: Enable per-doorbell bit handling while keeping legacy offset

For reference, v1 is:
https://lore.kernel.org/linux-pci/20260224133459.1741537-1-den@xxxxxxxxxxxxx/

Best regards,
Koichiro

>
>
> Best regards,
>
>
> Koichiro Den (10):
>   PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
>   PCI: endpoint: pci-epf-vntb: Defer pci_epc_raise_irq() out of atomic
>     context
>   PCI: endpoint: pci-epf-vntb: Report 0-based doorbell vector via
>     ntb_db_event()
>   PCI: endpoint: pci-epf-vntb: Exclude reserved slots from db_valid_mask
>   PCI: endpoint: pci-epf-vntb: Implement db_vector_count/mask for
>     doorbells
>   NTB: epf: Document legacy doorbell slot offset in
>     ntb_epf_peer_db_set()
>   NTB: epf: Make db_valid_mask cover only real doorbell bits
>   NTB: epf: Report 0-based doorbell vector via ntb_db_event()
>   NTB: epf: Fix doorbell bitmask handling in db_read/db_clear
>   NTB: epf: Implement db_vector_count/mask for doorbells
>
>  drivers/ntb/hw/epf/ntb_hw_epf.c               |  89 ++++++++++-
>  drivers/pci/endpoint/functions/pci-epf-vntb.c | 147 +++++++++++++++---
>  2 files changed, 210 insertions(+), 26 deletions(-)
>
> --
> 2.51.0
>
>