Re: [PATCH v4 5/7] iommu/dma: Make limit checks self-contained
From: Jens Glathe
Date: Mon Jun 03 2024 - 15:54:22 EST
Hi Robin,
an observation from 6.10-rc1: on sc8280xp (Lenovo X13s, Windows Dev Kit
2023), when booted to EL2 with the arm-smmu-v3 under Linux control, setting
up DMA transfers to the NVMe fails. My box boots from NVMe, so I only
got a black screen. @craftyguy booted from USB and got this:
[ 0.008641] CPU: All CPU(s) started at EL2
...
[ 1.475359] nvme 0002:01:00.0: Adding to iommu group 5
[ 1.476346] nvme nvme0: pci function 0002:01:00.0
[ 1.477134] nvme 0002:01:00.0: enabling device (0000 -> 0002)
[ 1.478457] ------------[ cut here ]------------
[ 1.479233] WARNING: CPU: 5 PID: 95 at drivers/iommu/io-pgtable-arm.c:304 __arm_lpae_map+0x2d0/0x3f0
[ 1.480040] Modules linked in: pcie_qcom phy_qcom_qmp_pcie nvme nvme_core
[ 1.480858] CPU: 5 PID: 95 Comm: kworker/u32:4 Not tainted 6.10.0-rc1 #1-lenovo-21bx
[ 1.481669] Hardware name: LENOVO 21BXCTO1WW/21BXCTO1WW, BIOS N3HET88W (1.60 ) 03/14/2024
[ 1.482483] Workqueue: async async_run_entry_fn
[ 1.483309] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1.484136] pc : __arm_lpae_map+0x2d0/0x3f0
[ 1.484963] lr : __arm_lpae_map+0x128/0x3f0
[ 1.485789] sp : ffff80008116b2e0
[ 1.486613] x29: ffff80008116b2f0 x28: 0000000000000001 x27: ffff591c40fb9ff8
[ 1.487447] x26: 0000000000000001 x25: 0000000000001000 x24: 00000000fffff000
[ 1.488285] x23: 0000000000000003 x22: 00000001038c8000 x21: 0000000000000f44
[ 1.489117] x20: 0000000000000001 x19: ffff591c4947bd80 x18: ffffffffffffffff
[ 1.489944] x17: ffff591c4945bc00 x16: ffffc0bbfadf8768 x15: ffff591c4389e000
[ 1.490771] x14: 0000000000000000 x13: ffff591c4947bd80 x12: ffff80008116b540
[ 1.491599] x11: 0000000000000dc0 x10: 0000000000000001 x9 : 0000000000000009
[ 1.492410] x8 : 00000000000001ff x7 : 0000000000000000 x6 : 0000000000000002
[ 1.493197] x5 : 0000000000000003 x4 : 0000000000000001 x3 : 0000000000000001
[ 1.493978] x2 : 0000000000000003 x1 : 0000000000000000 x0 : ffff591c4947bd80
[ 1.494771] Call trace:
[ 1.495541] __arm_lpae_map+0x2d0/0x3f0
[ 1.496320] __arm_lpae_map+0x128/0x3f0
[ 1.497091] __arm_lpae_map+0x128/0x3f0
[ 1.497861] __arm_lpae_map+0x128/0x3f0
[ 1.498620] arm_lpae_map_pages+0xfc/0x1d0
[ 1.499377] arm_smmu_map_pages+0x24/0x40
[ 1.500132] __iommu_map+0x134/0x2c8
[ 1.500888] iommu_map_sg+0xc0/0x1d0
[ 1.501640] __iommu_dma_alloc_noncontiguous.isra.0+0x2d8/0x4c0
[ 1.502403] iommu_dma_alloc+0x25c/0x3c8
[ 1.503162] dma_alloc_attrs+0x100/0x110
[ 1.503919] nvme_alloc_queue+0x6c/0x170 [nvme]
[ 1.504684] nvme_pci_enable+0x228/0x518 [nvme]
[ 1.505438] nvme_probe+0x290/0x6d8 [nvme]
[ 1.506188] local_pci_probe+0x48/0xb8
[ 1.506937] pci_device_probe+0xb0/0x1d8
[ 1.507683] really_probe+0xc8/0x3a0
[ 1.508433] __driver_probe_device+0x84/0x170
[ 1.509182] driver_probe_device+0x44/0x120
[ 1.509930] __device_attach_driver+0xc4/0x168
[ 1.510675] bus_for_each_drv+0x90/0xf8
[ 1.511434] __device_attach+0xa8/0x1c8
[ 1.512183] device_attach+0x1c/0x30
[ 1.512927] pci_bus_add_device+0x6c/0xe8
[ 1.513653] pci_bus_add_devices+0x40/0x98
[ 1.514357] pci_bus_add_devices+0x6c/0x98
[ 1.515058] pci_host_probe+0x4c/0xd0
[ 1.515756] dw_pcie_host_init+0x250/0x660
[ 1.516452] qcom_pcie_probe+0x234/0x320 [pcie_qcom]
[ 1.517155] platform_probe+0x70/0xd8
[ 1.517854] really_probe+0xc8/0x3a0
[ 1.518543] __driver_probe_device+0x84/0x170
[ 1.519230] driver_probe_device+0x44/0x120
[ 1.519914] __driver_attach_async_helper+0x58/0x100
[ 1.520603] async_run_entry_fn+0x3c/0x160
[ 1.521295] process_one_work+0x160/0x3f0
[ 1.521991] worker_thread+0x304/0x420
[ 1.522666] kthread+0x118/0x128
[ 1.523318] ret_from_fork+0x10/0x20
[ 1.523968] ---[ end trace 0000000000000000 ]---
[ 1.524788] nvme 0002:01:00.0: probe with driver nvme failed with error -12
From bisecting this I landed at this patchset, and I had to revert it
to make 6.10-rc1 work again. From what I've seen, the issue appears
to come from of_dma_configure_id(): instead of passing the base and size
parameters down the call chain, a map held in the device structure is now
handed to arch_setup_dma_ops().
Since this only happens with the arm-smmu-v3 under Linux control, it could
be either a malformed DT or a gap in deriving the right DMA range map from
the given parameters.
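In case it helps, something along these lines could be used to dump what
the device actually ends up with; an untested sketch only, not code from
the tree, with the field names taken from my reading of struct
bus_dma_region in include/linux/dma-direct.h:

#include <linux/device.h>
#include <linux/dma-direct.h>

/* Dump dev->dma_range_map: a zero-size-terminated array of regions. */
static void dump_dma_range_map(struct device *dev)
{
	const struct bus_dma_region *r = dev->dma_range_map;

	if (!r) {
		dev_info(dev, "no dma_range_map (no usable dma-ranges)\n");
		return;
	}

	for (; r->size; r++) {
		dma_addr_t dma_end = r->dma_start + r->size - 1;

		dev_info(dev, "dma %pad..%pad -> cpu %pap\n",
			 &r->dma_start, &dma_end, &r->cpu_start);
	}
}

Calling something like that from iommu_dma_init_domain(), right before the
aperture check, should show whether the map the device gets really overlaps
the SMMU's input range.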
The additional definition in the DT (sc8280xp.dtsi) for the pcie2a port
is this:
iommu-map = <0 &pcie_smmu 0x20000 0x10000>;
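For completeness, this is how I read that tuple (<rid-base &iommu
iommu-base length>, per the pci-iommu binding); an untested illustration
with a made-up function name, mirroring what the generic of_map_id()
translation does:

/*
 * RIDs 0x0000-0xffff on pcie2a translate to stream IDs 0x20000-0x2ffff
 * on &pcie_smmu.  This only assigns stream IDs; it does not describe
 * DMA addresses, so it shouldn't influence dev->dma_range_map itself.
 */
static u32 pcie2a_rid_to_sid(u32 rid)
{
	const u32 rid_base = 0x0, sid_base = 0x20000, length = 0x10000;

	if (rid - rid_base >= length)
		return ~0u;	/* RID not covered by the map */
	return sid_base + (rid - rid_base);
}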
This has worked since v6.8. The repository with a working version can be
found here:
https://github.com/jglathe/linux_ms_dev_kit/tree/jg/el2-blackrock-v6.10.y
with best regards
Jens
On 4/19/24 18:54, Robin Murphy wrote:
It's now easy to retrieve the device's DMA limits if we want to check
them against the domain aperture, so do that ourselves instead of
relying on them being passed through the callchain.
Reviewed-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
Tested-by: Hanjun Guo <guohanjun@xxxxxxxxxx>
Signed-off-by: Robin Murphy <robin.murphy@xxxxxxx>
---
drivers/iommu/dma-iommu.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index a3039005b696..f542eabaefa4 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -660,19 +660,16 @@ static void iommu_dma_init_options(struct iommu_dma_options *options,
/**
* iommu_dma_init_domain - Initialise a DMA mapping domain
* @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
- * @base: IOVA at which the mappable address space starts
- * @limit: Last address of the IOVA space
* @dev: Device the domain is being initialised for
*
- * @base and @limit + 1 should be exact multiples of IOMMU page granularity to
- * avoid rounding surprises. If necessary, we reserve the page at address 0
+ * If the geometry and dma_range_map include address 0, we reserve that page
* to ensure it is an invalid IOVA. It is safe to reinitialise a domain, but
* any change which could make prior IOVAs invalid will fail.
*/
-static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
- dma_addr_t limit, struct device *dev)
+static int iommu_dma_init_domain(struct iommu_domain *domain, struct device *dev)
{
struct iommu_dma_cookie *cookie = domain->iova_cookie;
+ const struct bus_dma_region *map = dev->dma_range_map;
unsigned long order, base_pfn;
struct iova_domain *iovad;
int ret;
@@ -684,18 +681,18 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
/* Use the smallest supported page size for IOVA granularity */
order = __ffs(domain->pgsize_bitmap);
- base_pfn = max_t(unsigned long, 1, base >> order);
+ base_pfn = 1;
/* Check the domain allows at least some access to the device... */
- if (domain->geometry.force_aperture) {
+ if (map) {
+ dma_addr_t base = dma_range_map_min(map);
if (base > domain->geometry.aperture_end ||
- limit < domain->geometry.aperture_start) {
+ dma_range_map_max(map) < domain->geometry.aperture_start) {
pr_warn("specified DMA range outside IOMMU capability\n");
return -EFAULT;
}
/* ...then finally give it a kicking to make sure it fits */
- base_pfn = max_t(unsigned long, base_pfn,
- domain->geometry.aperture_start >> order);
+ base_pfn = max(base, domain->geometry.aperture_start) >> order;
}
/* start_pfn is always nonzero for an already-initialised domain */
@@ -1760,7 +1757,7 @@ void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 dma_limit)
* underlying IOMMU driver needs to support via the dma-iommu layer.
*/
if (iommu_is_dma_domain(domain)) {
- if (iommu_dma_init_domain(domain, dma_base, dma_limit, dev))
+ if (iommu_dma_init_domain(domain, dev))
goto out_err;
dev->dma_ops = &iommu_dma_ops;
}