Re: [PATCH 4.4 095/101] iommu: Handle default domain attach failure

From: Ben Hutchings
Date: Wed Jul 05 2017 - 14:53:15 EST


On Mon, 2017-07-03 at 15:35 +0200, Greg Kroah-Hartman wrote:
> 4.4-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Robin Murphy <robin.murphy@xxxxxxx>
>
> commit 797a8b4d768c58caac58ee3e8cb36a164d1b7751 upstream.
>
> We wouldn't normally expect ops->attach_dev() to fail, but on IOMMUs
> with limited hardware resources, or generally misconfigured systems,
> it is certainly possible. We report failure correctly from the external
> iommu_attach_device() interface, but do not do so in iommu_group_add()
> when attaching to the default domain. The result of failure there is
> that the device, group and domain all get left in a broken,
> part-configured state which leads to weird errors and misbehaviour down
> the line when IOMMU API calls sort-of-but-don't-quite work.
>
> Check the return value of __iommu_attach_device() on the default domain,
> and refactor the error handling paths to cope with its failure and clean
> up correctly in such cases.
[...]
> @@ -432,8 +426,10 @@ rename:
> mutex_lock(&group->mutex);
> list_add_tail(&device->list, &group->devices);
> if (group->domain)
> - __iommu_attach_device(group->domain, dev);
> + ret = __iommu_attach_device(group->domain, dev);
> mutex_unlock(&group->mutex);
> + if (ret)
> + goto err_put_group;

It's still (briefly) possible for other tasks to observe the device in
the broken state. Shouldn't the error check be done before
mutex_unlock()?

> /* Notify any listeners about change to group. */
> blocking_notifier_call_chain(&group->notifier,
> @@ -444,6 +440,21 @@ rename:
> pr_info("Adding device %s to group %d\n", dev_name(dev), group->id);
>
> return 0;
> +
> +err_put_group:
> + mutex_lock(&group->mutex);
> + list_del(&device->list);
> + mutex_unlock(&group->mutex);
> + dev->iommu_group = NULL;
> + kobject_put(group->devices_kobj);
> +err_free_name:
> + kfree(device->name);
> +err_remove_link:
> + sysfs_remove_link(&dev->kobj, "iommu_group");
> +err_free_device:
> + kfree(device);
> + pr_err("Failed to add device %s to group %d: %d\n", dev_name(dev), group->id, ret);
> + return ret;
> }
> EXPORT_SYMBOL_GPL(iommu_group_add_device);
>

It seems like this cleanup statement in iommu_group_remove_device()
should also be done here under err_put_group:

sysfs_remove_link(group->devices_kobj, device->name);

Ben.

--
Ben Hutchings
Software Developer, Codethink Ltd.