Re: [PATCH v5 15/15] PCI: Revoke mappings like devmem
From: Bjorn Helgaas
Date: Tue Nov 03 2020 - 16:30:57 EST
On Fri, Oct 30, 2020 at 11:08:15AM +0100, Daniel Vetter wrote:
> Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> the region") /dev/mem zaps ptes when the kernel requests exclusive
> access to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> the default for all driver uses.
>
> Except there's two more ways to access PCI BARs: sysfs and proc mmap
> support. Let's plug that hole.
>
> For revoke_devmem() to work we need to link our vma into the same
> address_space, with consistent vma->vm_pgoff. ->pgoff is already
> adjusted, because that's how (io_)remap_pfn_range works, but for the
> mapping we need to adjust vma->vm_file->f_mapping. The cleanest way is
> to adjust this at ->open time:
>
> - for sysfs this is easy, now that binary attributes support this. We
> just set bin_attr->mapping when mmap is supported
> - for procfs it's a bit more tricky, since procfs pci access has only
> one file per device, and access to a specific resource first needs
> to be set up with some ioctl calls. But mmap is only supported for
> the same resources as sysfs exposes with mmap support, and otherwise
> rejected, so we can set the mapping unconditionally at open time
> without harm.
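
To illustrate why sharing one address_space matters here, below is a minimal
userspace sketch. It is a toy model, not kernel code: every toy_* name is
hypothetical, and toy_revoke() only stands in for the idea behind
revoke_devmem(), under the assumption that a revoke walks the vmas linked
into the shared iomem mapping and zaps the ones overlapping the claimed
range. A file that keeps its own private mapping is never visited, which is
the hole being plugged.

```c
#include <stddef.h>

#define TOY_MAX_VMAS 8

/* Toy stand-in for a vma: a page-offset range plus a "ptes present" flag. */
struct toy_vma {
	unsigned long pgoff;	/* first page offset covered */
	unsigned long npages;	/* number of pages covered */
	int mapped;		/* 1 while mapped, 0 once zapped */
};

/* Toy stand-in for an address_space: the set of vmas linked into it. */
struct toy_mapping {
	struct toy_vma *vmas[TOY_MAX_VMAS];
	size_t nr;
};

/* Toy stand-in for a struct file with its f_mapping pointer. */
struct toy_file {
	struct toy_mapping own;		/* private per-file mapping */
	struct toy_mapping *f_mapping;	/* mapping that mmap links vmas into */
};

/* The single shared mapping, playing the role of iomem_get_mapping(). */
static struct toy_mapping toy_iomem_mapping;

/* Open time: pick either the shared iomem mapping or the private one. */
static void toy_open(struct toy_file *f, int share_iomem)
{
	f->own.nr = 0;
	f->f_mapping = share_iomem ? &toy_iomem_mapping : &f->own;
}

/* mmap time: link the vma into whatever mapping was chosen at open. */
static int toy_mmap(struct toy_file *f, struct toy_vma *v,
		    unsigned long pgoff, unsigned long npages)
{
	struct toy_mapping *m = f->f_mapping;

	if (m->nr == TOY_MAX_VMAS)
		return -1;
	v->pgoff = pgoff;
	v->npages = npages;
	v->mapped = 1;
	m->vmas[m->nr++] = v;
	return 0;
}

/*
 * Revoke: zap every still-mapped vma in the shared mapping that overlaps
 * [start, start + npages). Returns the number of vmas zapped. Only
 * toy_iomem_mapping is walked, so privately mapped vmas are untouched.
 */
static int toy_revoke(unsigned long start, unsigned long npages)
{
	int zapped = 0;
	size_t i;

	for (i = 0; i < toy_iomem_mapping.nr; i++) {
		struct toy_vma *v = toy_iomem_mapping.vmas[i];

		if (v->mapped && v->pgoff < start + npages &&
		    start < v->pgoff + v->npages) {
			v->mapped = 0;
			zapped++;
		}
	}
	return zapped;
}
```

In this model, two files opened with share_iomem set (think sysfs and procfs)
both lose their overlapping vmas on toy_revoke(), while a file opened without
it keeps its vma mapped. The overlap test also depends on pgoff being
consistent across all users of the shared mapping.
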
>
> A special consideration is for arch_can_pci_mmap_io() - we need to
> make sure that the ->f_mapping doesn't alias between ioport and iomem
> space. There's only 2 ways in-tree to support mmap of ioports: generic
> pci mmap (ARCH_GENERIC_PCI_MMAP_RESOURCE), and sparc as the single
> architecture hand-rolling. Both approaches support ioport mmap through a
> special pfn range and not through magic pte attributes. Aliasing is
> therefore not a problem.
>
> The only difference in access checks left is that sysfs PCI mmap does
> not check for CAP_RAWIO. I'm not really sure whether that should be
> added or not.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> Cc: Jason Gunthorpe <jgg@xxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: John Hubbard <jhubbard@xxxxxxxxxx>
> Cc: Jérôme Glisse <jglisse@xxxxxxxxxx>
> Cc: Jan Kara <jack@xxxxxxx>
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Cc: linux-mm@xxxxxxxxx
> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> Cc: linux-samsung-soc@xxxxxxxxxxxxxxx
> Cc: linux-media@xxxxxxxxxxxxxxx
> Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> Cc: linux-pci@xxxxxxxxxxxxxxx
> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxx>
Acked-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> --
> v2:
> - Totally new approach: Adjust filp->f_mapping at open time. Note that
> this now works on all architectures, not just those that support
> ARCH_GENERIC_PCI_MMAP_RESOURCE
> ---
> drivers/pci/pci-sysfs.c | 4 ++++
> drivers/pci/proc.c | 1 +
> 2 files changed, 5 insertions(+)
>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index d15c881e2e7e..3f1c31bc0b7c 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -929,6 +929,7 @@ void pci_create_legacy_files(struct pci_bus *b)
> b->legacy_io->read = pci_read_legacy_io;
> b->legacy_io->write = pci_write_legacy_io;
> b->legacy_io->mmap = pci_mmap_legacy_io;
> + b->legacy_io->mapping = iomem_get_mapping();
> pci_adjust_legacy_attr(b, pci_mmap_io);
> error = device_create_bin_file(&b->dev, b->legacy_io);
> if (error)
> @@ -941,6 +942,7 @@ void pci_create_legacy_files(struct pci_bus *b)
> b->legacy_mem->size = 1024*1024;
> b->legacy_mem->attr.mode = 0600;
> b->legacy_mem->mmap = pci_mmap_legacy_mem;
> + b->legacy_mem->mapping = iomem_get_mapping();
> pci_adjust_legacy_attr(b, pci_mmap_mem);
> error = device_create_bin_file(&b->dev, b->legacy_mem);
> if (error)
> @@ -1156,6 +1158,8 @@ static int pci_create_attr(struct pci_dev *pdev, int num, int write_combine)
> res_attr->mmap = pci_mmap_resource_uc;
> }
> }
> + if (res_attr->mmap)
> + res_attr->mapping = iomem_get_mapping();
> res_attr->attr.name = res_attr_name;
> res_attr->attr.mode = 0600;
> res_attr->size = pci_resource_len(pdev, num);
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 3a2f90beb4cb..9bab07302bbf 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -298,6 +298,7 @@ static int proc_bus_pci_open(struct inode *inode, struct file *file)
> fpriv->write_combine = 0;
>
> file->private_data = fpriv;
> + file->f_mapping = iomem_get_mapping();
>
> return 0;
> }
> --
> 2.28.0
>