Re: [PATCH v5 0/5] PCI: endpoint: pci-epf-*ntb: Harden vNTB resource management

From: Koichiro Den

Date: Tue Mar 03 2026 - 22:10:55 EST


On Thu, Feb 26, 2026 at 05:41:37PM +0900, Koichiro Den wrote:
> The vNTB endpoint function (pci-epf-vntb) can be configured and
> reconfigured through configfs (link/unlink functions, start/stop the
> controller, update parameters). In practice, several pitfalls present:
> duplicate EPC teardown that leads to oopses, a work item running after
> resources were torn down, and inability to re-link/restart fundamentally
> because ntb_dev was embedded and the vPCI bus teardown was incomplete.
>
> This series addresses those issues and hardens resource management of
> pci-epf-vntb:
>
> - Remove duplicate EPC resource teardown in both pci-epf-vntb and
> pci-epf-ntb, avoiding crashes on .allow_link failures and during
> .drop_link.
> - Stop the delayed cmd_handler work before clearing BARs/doorbells.
> - Manage ntb_dev as a devm-managed allocation and implement .remove() in
> the vNTB PCI driver. Switch to pci_scan_root_bus().
>
> With these changes, the controller can now be stopped, a function
> unlinked, configfs settings updated, and the controller re-linked and
> restarted without rebooting the endpoint, as long as the underlying
> pci_epc_ops .stop() is non-destructive and .start() restores normal
> operation.
>
> Patches 1-3 carry Fixes tags and are candidates for stable.
> Patch 4 is a preparatory one for Patch 5.
> Patch 5 is a behavioral improvement that completes lifetime management for
> relink/restart scenarios.

While updating Patches 4 and 5 to address feedback from Mani, as well as the
concern I mentioned at [1], I noticed that if [2] gets merged before this
series, another issue may arise. With [2], the DB IRQ may become a shared IRQ,
in which case the unbind/remove race would require additional care.
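
As a rough illustration of one common form of the shared-IRQ concern (a
userspace model only, not code from this series or from [2]; every name below
is hypothetical): once a line is shared, a handler can be invoked because of
the other device's interrupt, so it has to be unregistered before the
resources it touches are released.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 4

/* Hypothetical model of a shared interrupt line: every registered
 * handler is called on each interrupt, whichever device raised it. */
typedef int (*irq_handler_fn)(void *dev_id);

struct shared_irq_line {
	irq_handler_fn handler[MAX_HANDLERS];
	void *dev_id[MAX_HANDLERS];
};

/* Register a handler on the line; returns 0 on success, -1 if full. */
static int line_request(struct shared_irq_line *l, irq_handler_fn fn,
			void *dev_id)
{
	for (size_t i = 0; i < MAX_HANDLERS; i++) {
		if (!l->handler[i]) {
			l->handler[i] = fn;
			l->dev_id[i] = dev_id;
			return 0;
		}
	}
	return -1;
}

/* Unregister the handler that was registered with this dev_id. */
static void line_free(struct shared_irq_line *l, void *dev_id)
{
	for (size_t i = 0; i < MAX_HANDLERS; i++) {
		if (l->dev_id[i] == dev_id) {
			l->handler[i] = NULL;
			l->dev_id[i] = NULL;
		}
	}
}

/* Deliver one interrupt; returns how many handlers claimed it. */
static int line_fire(struct shared_irq_line *l)
{
	int handled = 0;

	for (size_t i = 0; i < MAX_HANDLERS; i++)
		if (l->handler[i])
			handled += l->handler[i](l->dev_id[i]);
	return handled;
}

struct db_dev {
	int *counter;	/* resource that is released on remove */
	int pending;	/* set when this device raised the interrupt */
};

static int db_handler(void *dev_id)
{
	struct db_dev *d = dev_id;

	if (!d->pending)
		return 0;	/* not ours: leave it to the other handlers */

	/* Touching a released resource would be a use-after-free in a
	 * real driver; the model turns it into an assertion failure. */
	assert(d->counter != NULL);
	d->pending = 0;
	(*d->counter)++;
	return 1;
}

/* Teardown order in this model: unregister first, then release. */
static void db_remove(struct shared_irq_line *l, struct db_dev *d)
{
	line_free(l, d);
	d->counter = NULL;
}
```

In this model, registering two db_dev instances on one line and calling
db_remove() on the first leaves the second one's interrupts working, and the
removed device's handler is never entered again.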


Mani, if it's ok, could you take Patches 1-3?

If so, I'll spin the rest (Patches 4-5) into a separate patch series, starting
from v6, with some additional commits.

It turns out that Patches 4-5 are a bigger change than I initially thought.
Even though Patches 1-3 were originally written as preparatory fixes, they can
be applied independently at any time.

The code in Patches 1-3 has also remained unchanged since v1 (submitted last
October).

[1] https://lore.kernel.org/linux-pci/mipdls67csyyrugf4rjx3qqtbxes4sjjtluy3psecnadcgcs7k@rn42d3m6ggsf/
[2] [PATCH v10 0/7] PCI: endpoint: pci-ep-msi: Add embedded doorbell fallback
https://lore.kernel.org/linux-pci/20260302071427.534158-1-den@xxxxxxxxxxxxx/


Best regards,
Koichiro

>
> ---
> v4->v5 changes:
> - Rebased onto the latest pci/endpoint (2026-02-26).
> - Dropped [PATCH v4 1/7]; will be reposted separately via the NTB tree.
> - Dropped [PATCH v4 2/7], which has been applied in a different form.
> - Corrected the subject prefix of [PATCH v4 5/7]:
> s/NTB: epf: vntb:/PCI: endpoint: pci-epf-vntb:/.
> - Picked up a Reviewed-by tag to [PATCH v4 7/7].
> - Resolved a conflict in [PATCH v4 7/7] due to commit
> dc693d606644 ("PCI: endpoint: pci-epf-vntb: Add MSI doorbell support").
> v3->v4 changes:
> - Added Reviewed-by tag for [PATCH v3 6/6].
> - Corrected patch split by moving the blank-line cleanup,
> based on the feedback from Frank.
> (No code changes overall.)
> v2->v3 changes:
> - Added Reviewed-by tag for [PATCH v2 4/6].
> - Split [PATCH v2 6/6] into two, based on the feedback from Frank.
> (No code changes overall.)
> v1->v2 changes:
> - Incorporated feedback from Frank.
> - Added Reviewed-by tags (except for patches #4 and #6).
> - Fixed a typo in patch #5 title.
> (No code changes overall.)
>
> v4: https://lore.kernel.org/linux-pci/20251202072348.2752371-1-den@xxxxxxxxxxxxx/
> v3: https://lore.kernel.org/all/20251130151100.2591822-1-den@xxxxxxxxxxxxx/
> v2: https://lore.kernel.org/all/20251029080321.807943-1-den@xxxxxxxxxxxxx/
> v1: https://lore.kernel.org/all/20251023071757.901181-1-den@xxxxxxxxxxxxx/
>
>
> Koichiro Den (5):
> PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardown
> PCI: endpoint: pci-epf-ntb: Remove duplicate resource teardown
> PCI: endpoint: pci-epf-vntb: Stop cmd_handler work in
> epf_ntb_epc_cleanup
> PCI: endpoint: pci-epf-vntb: Switch vpci_scan_bus() to use
> pci_scan_root_bus()
> PCI: endpoint: pci-epf-vntb: manage ntb_dev lifetime and fix vpci bus
> teardown
>
> drivers/pci/endpoint/functions/pci-epf-ntb.c | 56 +-----------
> drivers/pci/endpoint/functions/pci-epf-vntb.c | 88 ++++++++++++-------
> 2 files changed, 57 insertions(+), 87 deletions(-)
>
> --
> 2.51.0
>
>