Re: [PATCH v2 5/7] iommu/riscv: Device directory management.

From: Tomasz Jeznach
Date: Wed Apr 24 2024 - 19:01:25 EST


On Fri, Apr 19, 2024 at 5:40 AM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> On Thu, Apr 18, 2024 at 09:32:23AM -0700, Tomasz Jeznach wrote:
> > @@ -31,13 +32,350 @@ MODULE_LICENSE("GPL");
> > /* Timeouts in [us] */
> > #define RISCV_IOMMU_DDTP_TIMEOUT 50000
> >
> > -static int riscv_iommu_attach_identity_domain(struct iommu_domain *domain,
> > - struct device *dev)
> > +/* RISC-V IOMMU PPN <> PHYS address conversions, PHYS <=> PPN[53:10] */
> > +#define phys_to_ppn(va) (((va) >> 2) & (((1ULL << 44) - 1) << 10))
> > +#define ppn_to_phys(pn) (((pn) << 2) & (((1ULL << 44) - 1) << 12))
> > +
> > +#define dev_to_iommu(dev) \
> > + container_of((dev)->iommu->iommu_dev, struct riscv_iommu_device, iommu)
>
> We have iommu_get_iommu_dev() now
>
> > +static unsigned long riscv_iommu_get_pages(struct riscv_iommu_device *iommu, unsigned int order)
> > +{
> > + struct riscv_iommu_devres *devres;
> > + struct page *pages;
> > +
> > + pages = alloc_pages_node(dev_to_node(iommu->dev),
> > + GFP_KERNEL_ACCOUNT | __GFP_ZERO, order);
> > + if (unlikely(!pages)) {
> > + dev_err(iommu->dev, "Page allocation failed, order %u\n", order);
> > + return 0;
> > + }
>
> This needs adjusting for the recently merged allocation accounting
>
> > +static int riscv_iommu_attach_domain(struct riscv_iommu_device *iommu,
> > + struct device *dev,
> > + struct iommu_domain *iommu_domain)
> > +{
> > + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > + struct riscv_iommu_dc *dc;
> > + u64 fsc, ta, tc;
> > + int i;
> > +
> > + if (!iommu_domain) {
> > + ta = 0;
> > + tc = 0;
> > + fsc = 0;
> > + } else if (iommu_domain->type == IOMMU_DOMAIN_IDENTITY) {
> > + ta = 0;
> > + tc = RISCV_IOMMU_DC_TC_V;
> > + fsc = FIELD_PREP(RISCV_IOMMU_DC_FSC_MODE, RISCV_IOMMU_DC_FSC_MODE_BARE);
> > + } else {
> > + /* This should never happen. */
> > + return -ENODEV;
> > + }
>
> Please don't write it like this. This function is already being called
> by functions that are already under specific ops, don't check
> domain->type here.
>
> Instead have the caller compute and pass in the ta/tc/fsc
> values. Maybe in a tidy struct..
>
> > + /* Update existing or allocate new entries in device directory */
> > + for (i = 0; i < fwspec->num_ids; i++) {
> > + dc = riscv_iommu_get_dc(iommu, fwspec->ids[i], !iommu_domain);
> > + if (!dc && !iommu_domain)
> > + continue;
> > + if (!dc)
> > + return -ENODEV;
>
> But if this fails some of the fwspecs were left in a weird state ?
>
> Drivers should try hard to have attach functions that fail and make no
> change at all or fully succeed.
>
> Meaning ideally preallocate any required memory before doing any
> change to the HW visable structures.
>

Good point. Done.
Looking at the fwspec->ids[] I'm assuming nobody will add/modify the
IDs after iommu_probe_device() completes.

> > +
> > + /* Swap device context, update TC valid bit as the last operation */
> > + xchg64(&dc->fsc, fsc);
> > + xchg64(&dc->ta, ta);
> > + xchg64(&dc->tc, tc);
>
> This doesn't loook right? When you get to adding PAGING suport fsc has
> the page table pfn and ta has the cache tag, so this will end up
> tearing the data for sure, eg when asked to replace a PAGING domain
> with another PAGING domain? That will create a functional/security
> problem, right?
>
> I would encourage you to re-use the ARM sequencing code, ideally moved
> to some generic helper library. Every iommu driver dealing with
> multi-quanta descriptors seems to have this same fundamental
> sequencing problem.
>

Good point. Reworked.

> > +static void riscv_iommu_release_device(struct device *dev)
> > +{
> > + struct riscv_iommu_device *iommu = dev_to_iommu(dev);
> > +
> > + riscv_iommu_attach_domain(iommu, dev, NULL);
> > +}
>
> The release_domain has landed too now. Please don't invent weird NULL
> domain types that have special meaning. I assume clearing the V bit is
> a blocking behavior? So please implement a proper blocking domain and
> set release_domain = &riscv_iommu_blocking and just omit this release
> function.
>

Updated to use release_domain, should be cleaner now.
Clearing TC.V is a blocking (but noisy) behavior, should be fine for
release domain where devices should be quiesced already.

> > @@ -133,12 +480,14 @@ int riscv_iommu_init(struct riscv_iommu_device *iommu)
> > rc = riscv_iommu_init_check(iommu);
> > if (rc)
> > return dev_err_probe(iommu->dev, rc, "unexpected device state\n");
> > - /*
> > - * Placeholder for a complete IOMMU device initialization.
> > - * For now, only bare minimum: enable global identity mapping mode and register sysfs.
> > - */
> > - riscv_iommu_writeq(iommu, RISCV_IOMMU_REG_DDTP,
> > - FIELD_PREP(RISCV_IOMMU_DDTP_MODE, RISCV_IOMMU_DDTP_MODE_BARE));
> > +
> > + rc = riscv_iommu_ddt_alloc(iommu);
> > + if (WARN(rc, "cannot allocate device directory\n"))
> > + goto err_init;
>
> memory allocation failure already makes noisy prints, more prints are
> not needed..
>
> > + rc = riscv_iommu_set_ddtp_mode(iommu, RISCV_IOMMU_DDTP_MODE_MAX);
> > + if (WARN(rc, "cannot enable iommu device\n"))
> > + goto err_init;
>
> This is not a proper use of WARN, it should only be used for things
> that cannot happen not undesired error paths.
>
> Jason

Thanks, ack to all. Will push updated v3 shortly.
- Tomasz