Re: [RFC PATCH v6 07/20] vfio/iommu_type1: implement the VFIO_DMA_MAP_FLAG_NOEXEC flag

From: Alex Williamson
Date: Thu Jun 05 2014 - 16:48:56 EST


On Thu, 2014-06-05 at 19:03 +0200, Antonios Motakis wrote:
> Some IOMMU drivers, such as the ARM SMMU driver, make available the
> IOMMU_NOEXEC flag, to set the page tables for a device as XN (execute never).
> This affects devices such as the ARM PL330 DMA Controller, which respects
> this flag and will refuse to fetch DMA instructions from memory where the
> XN flag has been set.
>
> Signed-off-by: Antonios Motakis <a.motakis@xxxxxxxxxxxxxxxxxxxxxx>
> ---
> drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++++++++++++++++++++-
> 1 file changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 6673e7b..e2566fd 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -80,6 +80,24 @@ struct vfio_group {
>  	struct list_head next;
>  };
>
> +static int vfio_domains_have_cap_noexec(struct vfio_iommu *iommu)
> +{
> +	struct vfio_domain *d;
> +	int ret = 1;
> +
> +	mutex_lock(&iommu->lock);
> +	list_for_each_entry(d, &iommu->domain_list, next) {
> +		if (!iommu_domain_has_cap(d->domain, IOMMU_CAP_NOEXEC)) {
> +			ret = 0;
> +			break;
> +		}
> +	}
> +	mutex_unlock(&iommu->lock);
> +
> +	return ret;
> +}
> +
> +
>  /*
>   * This code handles mapping and unmapping of user data buffers
>   * into DMA'ble space using the IOMMU
> @@ -542,6 +560,11 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>  		prot |= IOMMU_WRITE;
>  	if (map->flags & VFIO_DMA_MAP_FLAG_READ)
>  		prot |= IOMMU_READ;
> +	if (map->flags & VFIO_DMA_MAP_FLAG_NOEXEC) {
> +		if (!vfio_domains_have_cap_noexec(iommu))
> +			return -EINVAL;
> +		prot |= IOMMU_NOEXEC;
> +	}
>
>  	if (!prot)
>  		return -EINVAL; /* No READ/WRITE? */
> @@ -899,6 +922,10 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>  			if (!iommu)
>  				return 0;
>  			return vfio_domains_have_iommu_cache(iommu);
> +		case VFIO_IOMMU_PROT_NOEXEC:
> +			if (!iommu)
> +				return 0;
> +			return vfio_domains_have_cap_noexec(iommu);
>  		default:
>  			return 0;
>  		}
> @@ -922,7 +949,8 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>  	} else if (cmd == VFIO_IOMMU_MAP_DMA) {
>  		struct vfio_iommu_type1_dma_map map;
>  		uint32_t mask = VFIO_DMA_MAP_FLAG_READ |
> -				VFIO_DMA_MAP_FLAG_WRITE;
> +				VFIO_DMA_MAP_FLAG_WRITE |
> +				VFIO_DMA_MAP_FLAG_NOEXEC;
>
>  		minsz = offsetofend(struct vfio_iommu_type1_dma_map, size);
>

This doesn't look complete. What happens if we support NOEXEC, then we
add a group behind an IOMMU that doesn't support NOEXEC? In one case we
attempt to use the same domain, which might imply the same IOMMU page
tables, but is the hardware going to support that? In the other case we
create a new domain, then try to replay the mappings, including the
NOEXEC bit... what's that going to do? Can we add a non-NOEXEC domain
to a container that was previously NOEXEC? The point of changing the
flag from EXEC to NOEXEC was to make it enforceable, but doesn't that
mean we need to prevent that happening if we have NOEXEC mappings?
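
To make the concern concrete, here is a rough userspace model of one
possible policy: refuse to attach a domain that lacks NOEXEC while the
container still holds NOEXEC mappings. This is only a sketch under
assumptions, not what the patch or the existing code does. Every model_*
name, the MODEL_CAP_* bits and the mapping counter are hypothetical
stand-ins, not kernel APIs. It also shows the cache and noexec checks as
thin wrappers around one capability walk.

```c
/* Userspace model only: none of these types or helpers exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_CAP_CACHE   (1u << 0) /* stands in for IOMMU_CAP_CACHE_COHERENCY */
#define MODEL_CAP_NOEXEC  (1u << 1) /* stands in for IOMMU_CAP_NOEXEC */
#define MODEL_MAX_DOMAINS 8

struct model_container {
	unsigned int domain_caps[MODEL_MAX_DOMAINS]; /* capability bits per domain */
	size_t nr_domains;
	size_t nr_noexec_mappings; /* live mappings created with NOEXEC */
};

/* Common helper: true only if every attached domain advertises 'cap'.
 * Like the patch, an empty domain list yields true. */
static bool model_domains_have_cap(const struct model_container *c,
				   unsigned int cap)
{
	for (size_t i = 0; i < c->nr_domains; i++)
		if (!(c->domain_caps[i] & cap))
			return false;
	return true;
}

static bool model_domains_have_iommu_cache(const struct model_container *c)
{
	return model_domains_have_cap(c, MODEL_CAP_CACHE);
}

static bool model_domains_have_iommu_noexec(const struct model_container *c)
{
	return model_domains_have_cap(c, MODEL_CAP_NOEXEC);
}

/* Attach a domain; returns 0 on success, -1 if the attach is refused. */
static int model_attach_domain(struct model_container *c, unsigned int caps)
{
	assert(c != NULL);
	if (c->nr_domains == MODEL_MAX_DOMAINS)
		return -1;
	/* A domain that cannot enforce NOEXEC must not see NOEXEC mappings. */
	if (c->nr_noexec_mappings > 0 && !(caps & MODEL_CAP_NOEXEC))
		return -1;
	c->domain_caps[c->nr_domains++] = caps;
	return 0;
}

/* Record a mapping; a NOEXEC request needs NOEXEC support in every domain. */
static int model_map(struct model_container *c, bool noexec)
{
	assert(c != NULL);
	if (noexec) {
		if (!model_domains_have_iommu_noexec(c))
			return -1;
		c->nr_noexec_mappings++;
	}
	return 0;
}
```

In this model, attaching a NOEXEC-capable domain, creating a NOEXEC
mapping, and then attaching a domain without NOEXEC fails at the last
step. Whether refusing the attach is the right answer, as opposed to
some other policy, is exactly the open question above.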

Something needs to happen in vfio_iommu_type1_attach_group() to figure
out what's allowed. Also, vfio_domains_have_cap_noexec() should be
named vfio_domains_have_iommu_noexec() to match _iommu_cache() and
possibly they should both be wrappers to a common function if they end
up sharing similar implementations. Thanks,

Alex
