[PATCH 00/10] NTB: epf: Enable per-doorbell bit handling while keeping legacy offset
From: Koichiro Den
Date: Tue Feb 24 2026 - 08:36:45 EST
This series fixes doorbell bit/vector handling for the EPF-based NTB
pair (ntb_hw_epf <-> pci-epf-*ntb). Its primary goal is to enable safe
per-db-vector handling in the NTB core and clients (e.g. ntb_transport),
without changing the on-the-wire doorbell mapping.
Background / problem
====================
ntb_hw_epf historically applies an extra offset when ringing peer
doorbells: the link event uses the first interrupt slot, and doorbells
start from the third slot (i.e. the second slot is effectively unused).
pci-epf-vntb carries the matching offset on the EP side as well.
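
For illustration, the slot layout described above can be modeled in
plain userspace C. This is only a sketch: the macro and function names
are hypothetical and do not appear in the drivers, and slot numbers are
assumed to be 0-based (link event on slot 0, doorbells from slot 2).

```c
#include <assert.h>

/* Illustrative names only; these are not identifiers from the drivers. */
#define SLOT_LINK	0	/* first slot: link event */
#define SLOT_UNUSED	1	/* second slot: effectively unused */
#define SLOT_DB_BASE	2	/* third slot: first doorbell */

/* Interrupt slot rung for 0-based doorbell 'db' under the legacy layout. */
static unsigned int legacy_db_to_slot(unsigned int db)
{
	return db + SLOT_DB_BASE;
}

/* 0-based doorbell carried by 'slot'; only valid for doorbell slots. */
static unsigned int legacy_slot_to_db(unsigned int slot)
{
	assert(slot >= SLOT_DB_BASE);	/* link/unused slots carry no doorbell */
	return slot - SLOT_DB_BASE;
}
```

With this model, doorbell 0 rings slot 2 and doorbell 3 rings slot 5;
the two helpers are inverses of each other for every doorbell slot.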
As long as db_vector_count()/db_vector_mask() are not implemented, this
mismatch is mostly masked. Doorbell events are effectively treated as
"can hit any QP" and the off-by-one vector numbering does not surface
clearly.
However, once per-vector handling is enabled, the current state becomes
problematic:
- db_valid_mask exposes bits that do not correspond to real doorbells
  (link/unused slots leak into the mask).
- ntb_db_event() is fed with 1-based/shifted vectors, while NTB core
  expects a 0-based db_vector for doorbells.
- On pci-epf-vntb, .peer_db_set() may be called in atomic context, but
  it directly calls pci_epc_raise_irq(), which can sleep.
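
The mask problem in the first item can be sketched as follows. This is
a userspace model under stated assumptions, not driver code: 'nslots'
is assumed to count all interrupt slots (link, unused and doorbells),
and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NON_DB_SLOTS	2u	/* link event slot + unused slot */

/* Mask that treats every slot as a doorbell: two bits too wide. */
static uint64_t leaky_valid_mask(unsigned int nslots)
{
	assert(nslots < 64);
	return (UINT64_C(1) << nslots) - 1;
}

/* Mask that covers only the real doorbells. */
static uint64_t real_valid_mask(unsigned int nslots)
{
	assert(nslots >= NON_DB_SLOTS && nslots < 64);
	return (UINT64_C(1) << (nslots - NON_DB_SLOTS)) - 1;
}
```

For six slots in total, the leaky mask is 0x3f while only four
doorbells (0xf) actually exist; a client trusting the wider mask could
try to use two doorbells that never fire.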
Why NOT fix the root offset?
============================
The natural "root" fix would be to remove the historical extra offset in
the peer_db_set() doorbell paths for ntb_hw_epf and pci-epf-vntb.
Unfortunately this would lead to interoperability issues when mixing old
and new kernel versions (old/new peers). A new side would ring a
different interrupt slot than what an old peer expects, leading to
missed or misrouted doorbells, once db_vector_count()/db_vector_mask()
are implemented.
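
A small model of the mixed-version case may make this concrete. It
assumes, purely for illustration, that a "fixed" sender would place
doorbells directly after the link slot (base 1); that layout and all
names below are hypothetical.

```c
#include <assert.h>

#define LEGACY_DB_BASE	2	/* legacy: doorbells start at the third slot */
#define COMPACT_DB_BASE	1	/* hypothetical layout without the unused slot */

/* Slot a sender rings for 0-based doorbell 'db', given its layout base. */
static unsigned int ring_slot(unsigned int db, unsigned int base)
{
	assert(base >= 1);	/* slot 0 is the link event in both layouts */
	return db + base;
}

/*
 * Doorbell a receiver infers from 'slot', or -1 if the slot is not a
 * doorbell slot in the receiver's layout.
 */
static int decode_slot(unsigned int slot, unsigned int base)
{
	if (slot < base)
		return -1;
	return (int)(slot - base);
}
```

With a new sender (base 1) and an old receiver (base 2), doorbell 0
lands on the unused slot and is missed, while doorbell 3 is decoded as
doorbell 2 and is misrouted. When both sides use base 2, every doorbell
round-trips unchanged.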
Therefore this series intentionally keeps the legacy offset, and instead
fixes the surrounding pieces so the mapping is documented and handled
consistently in masks, vector numbering, and per-vector reporting.
What this series does
=====================
- pci-epf-vntb:
  - Document the legacy offset.
  - Defer MSI doorbell raises to process context to avoid sleeping in
    atomic context. This becomes relevant once multiple doorbells are
    raised concurrently at a high rate.
  - Report doorbell vectors as 0-based to ntb_db_event().
  - Fix db_valid_mask and implement db_vector_count()/db_vector_mask().
- ntb_hw_epf:
  - Document the legacy offset in ntb_epf_peer_db_set().
  - Fix db_valid_mask to cover only real doorbell bits.
  - Report 0-based db_vector to ntb_db_event() (accounting for the
    unused slot).
  - Keep db_val as a bitmask and fix db_read/db_clear semantics
    accordingly.
  - Implement db_vector_count()/db_vector_mask().
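
The bitmask semantics listed above can be sketched in userspace C. This
is a model under assumptions, not the patch itself: the struct and
function names are hypothetical, each vector is assumed to own exactly
one doorbell bit, and the valid mask is assumed to be contiguous from
bit 0.

```c
#include <assert.h>
#include <stdint.h>

struct db_state {
	uint64_t valid_mask;	/* bits for real doorbells only */
	uint64_t db_val;	/* pending doorbells, one bit per 0-based vector */
};

/* Interrupt path: latch the doorbell bit for a 0-based vector. */
static void db_event(struct db_state *s, unsigned int vector)
{
	assert(vector < 64);
	s->db_val |= (UINT64_C(1) << vector) & s->valid_mask;
}

/* Return all pending doorbell bits. */
static uint64_t db_read(const struct db_state *s)
{
	return s->db_val;
}

/* Clear only the requested bits; other pending doorbells survive. */
static void db_clear(struct db_state *s, uint64_t bits)
{
	s->db_val &= ~bits;
}

/* Doorbell bits served by 'vector': a single bit, or 0 if out of range. */
static uint64_t db_vector_mask(const struct db_state *s, unsigned int vector)
{
	if (vector >= 64)
		return 0;
	return (UINT64_C(1) << vector) & s->valid_mask;
}

/* Number of doorbell vectors, assuming a contiguous valid mask. */
static unsigned int db_vector_count(const struct db_state *s)
{
	unsigned int n = 0;

	for (uint64_t m = s->valid_mask; m; m >>= 1)
		n++;
	return n;
}
```

The point of the model is that clearing the bits of one vector leaves
doorbells latched by other vectors pending, which is what a per-vector
client such as ntb_transport relies on.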
Compatibility
=============
By keeping the legacy offset intact, this series aims to remain
compatible across mixed kernel versions. The observable changes are
limited to correct mask/vector reporting and safer execution context
handling.
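
As a rough picture of the execution context change, the deferral can be
modeled with a pending bitmask that an atomic-context caller only sets
and a process-context worker later drains. This is a userspace sketch
using C11 atomics; every name is hypothetical, and whether the actual
patch coalesces repeated rings in this way is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define LEGACY_DB_BASE	2	/* doorbells start at the third slot */

static _Atomic uint64_t pending_db;	/* doorbells queued for the worker */
static unsigned int raised[64];		/* log of slots raised, for inspection */
static unsigned int nraised;

/* Stand-in for an interrupt-raise call that may sleep. */
static void raise_irq_may_sleep(unsigned int slot)
{
	assert(nraised < 64);
	raised[nraised++] = slot;
}

/* Atomic-context side: record the request and return without sleeping. */
static void peer_db_set_atomic(uint64_t db_bits)
{
	atomic_fetch_or(&pending_db, db_bits);
	/* a real driver would schedule its worker here */
}

/* Process-context side: take all pending bits and raise each slot. */
static void db_worker(void)
{
	uint64_t bits = atomic_exchange(&pending_db, 0);

	for (unsigned int db = 0; bits; db++, bits >>= 1)
		if (bits & 1)
			raise_irq_may_sleep(db + LEGACY_DB_BASE);
}
```

In this model nothing is raised until the worker runs, and the slots
rung still carry the legacy offset, so the on-the-wire mapping is
unchanged.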
Patches 1-5 (PCI Endpoint) and 6-10 (NTB) are independent and can be
applied separately for each tree. I am sending them together in this
series to provide the full context and to make the cross-subsystem
compatibility constraints explicit. Ideally the whole series would be
applied in a single tree, but each subset is safe to merge on its own.
Note: I don't have suitable hardware to test the ntb_hw_epf +
pci-epf-ntb (not vNTB) bridge scenario, but I believe no changes are
needed in pci-epf-ntb.c.
Development based on: v7.0-rc1
Best regards,
Koichiro Den (10):
  PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
  PCI: endpoint: pci-epf-vntb: Defer pci_epc_raise_irq() out of atomic
    context
  PCI: endpoint: pci-epf-vntb: Report 0-based doorbell vector via
    ntb_db_event()
  PCI: endpoint: pci-epf-vntb: Exclude reserved slots from db_valid_mask
  PCI: endpoint: pci-epf-vntb: Implement db_vector_count/mask for
    doorbells
  NTB: epf: Document legacy doorbell slot offset in
    ntb_epf_peer_db_set()
  NTB: epf: Make db_valid_mask cover only real doorbell bits
  NTB: epf: Report 0-based doorbell vector via ntb_db_event()
  NTB: epf: Fix doorbell bitmask handling in db_read/db_clear
  NTB: epf: Implement db_vector_count/mask for doorbells
drivers/ntb/hw/epf/ntb_hw_epf.c | 79 +++++++++-
drivers/pci/endpoint/functions/pci-epf-vntb.c | 143 +++++++++++++++---
2 files changed, 196 insertions(+), 26 deletions(-)
--
2.51.0