Re: [PATCH v2 5/7] iommu/riscv: Device directory management.
From: Jason Gunthorpe
Date: Fri Apr 19 2024 - 08:40:25 EST
On Thu, Apr 18, 2024 at 09:32:23AM -0700, Tomasz Jeznach wrote:
> @@ -31,13 +32,350 @@ MODULE_LICENSE("GPL");
> /* Timeouts in [us] */
> #define RISCV_IOMMU_DDTP_TIMEOUT 50000
>
> -static int riscv_iommu_attach_identity_domain(struct iommu_domain *domain,
> - struct device *dev)
> +/* RISC-V IOMMU PPN <> PHYS address conversions, PHYS <=> PPN[53:10] */
> +#define phys_to_ppn(va) (((va) >> 2) & (((1ULL << 44) - 1) << 10))
> +#define ppn_to_phys(pn) (((pn) << 2) & (((1ULL << 44) - 1) << 12))
> +
> +#define dev_to_iommu(dev) \
> + container_of((dev)->iommu->iommu_dev, struct riscv_iommu_device, iommu)
We have iommu_get_iommu_dev() now
> +static unsigned long riscv_iommu_get_pages(struct riscv_iommu_device *iommu, unsigned int order)
> +{
> + struct riscv_iommu_devres *devres;
> + struct page *pages;
> +
> + pages = alloc_pages_node(dev_to_node(iommu->dev),
> + GFP_KERNEL_ACCOUNT | __GFP_ZERO, order);
> + if (unlikely(!pages)) {
> + dev_err(iommu->dev, "Page allocation failed, order %u\n", order);
> + return 0;
> + }
This needs adjusting for the recently merged allocation accounting
> +static int riscv_iommu_attach_domain(struct riscv_iommu_device *iommu,
> + struct device *dev,
> + struct iommu_domain *iommu_domain)
> +{
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> + struct riscv_iommu_dc *dc;
> + u64 fsc, ta, tc;
> + int i;
> +
> + if (!iommu_domain) {
> + ta = 0;
> + tc = 0;
> + fsc = 0;
> + } else if (iommu_domain->type == IOMMU_DOMAIN_IDENTITY) {
> + ta = 0;
> + tc = RISCV_IOMMU_DC_TC_V;
> + fsc = FIELD_PREP(RISCV_IOMMU_DC_FSC_MODE, RISCV_IOMMU_DC_FSC_MODE_BARE);
> + } else {
> + /* This should never happen. */
> + return -ENODEV;
> + }
Please don't write it like this. This function is already being called
by functions that are already under specific ops, don't check
domain->type here.
Instead have the caller compute and pass in the ta/tc/fsc
values. Maybe in a tidy struct..
> + /* Update existing or allocate new entries in device directory */
> + for (i = 0; i < fwspec->num_ids; i++) {
> + dc = riscv_iommu_get_dc(iommu, fwspec->ids[i], !iommu_domain);
> + if (!dc && !iommu_domain)
> + continue;
> + if (!dc)
> + return -ENODEV;
But if this fails some of the fwspecs were left in a weird state ?
Drivers should try hard to have attach functions that fail and make no
change at all or fully succeed.
Meaning ideally preallocate any required memory before doing any
change to the HW visable structures.
> +
> + /* Swap device context, update TC valid bit as the last operation */
> + xchg64(&dc->fsc, fsc);
> + xchg64(&dc->ta, ta);
> + xchg64(&dc->tc, tc);
This doesn't loook right? When you get to adding PAGING suport fsc has
the page table pfn and ta has the cache tag, so this will end up
tearing the data for sure, eg when asked to replace a PAGING domain
with another PAGING domain? That will create a functional/security
problem, right?
I would encourage you to re-use the ARM sequencing code, ideally moved
to some generic helper library. Every iommu driver dealing with
multi-quanta descriptors seems to have this same fundamental
sequencing problem.
> +static void riscv_iommu_release_device(struct device *dev)
> +{
> + struct riscv_iommu_device *iommu = dev_to_iommu(dev);
> +
> + riscv_iommu_attach_domain(iommu, dev, NULL);
> +}
The release_domain has landed too now. Please don't invent weird NULL
domain types that have special meaning. I assume clearing the V bit is
a blocking behavior? So please implement a proper blocking domain and
set release_domain = &riscv_iommu_blocking and just omit this release
function.
> @@ -133,12 +480,14 @@ int riscv_iommu_init(struct riscv_iommu_device *iommu)
> rc = riscv_iommu_init_check(iommu);
> if (rc)
> return dev_err_probe(iommu->dev, rc, "unexpected device state\n");
> - /*
> - * Placeholder for a complete IOMMU device initialization.
> - * For now, only bare minimum: enable global identity mapping mode and register sysfs.
> - */
> - riscv_iommu_writeq(iommu, RISCV_IOMMU_REG_DDTP,
> - FIELD_PREP(RISCV_IOMMU_DDTP_MODE, RISCV_IOMMU_DDTP_MODE_BARE));
> +
> + rc = riscv_iommu_ddt_alloc(iommu);
> + if (WARN(rc, "cannot allocate device directory\n"))
> + goto err_init;
memory allocation failure already makes noisy prints, more prints are
not needed..
> + rc = riscv_iommu_set_ddtp_mode(iommu, RISCV_IOMMU_DDTP_MODE_MAX);
> + if (WARN(rc, "cannot enable iommu device\n"))
> + goto err_init;
This is not a proper use of WARN, it should only be used for things
that cannot happen not undesired error paths.
Jason