Re: [PATCH 11/13] mm: add unsafe_follow_pfn

From: Jason Gunthorpe
Date: Wed Oct 07 2020 - 13:36:53 EST


On Wed, Oct 07, 2020 at 06:44:24PM +0200, Daniel Vetter wrote:
> Way back it was a reasonable assumptions that iomem mappings never
> change the pfn range they point at. But this has changed:
>
> - gpu drivers dynamically manage their memory nowadays, invalidating
> ptes with unmap_mapping_range when buffers get moved
>
> - contiguous dma allocations have moved from dedicated carvetouts to
> cma regions. This means if we miss the unmap the pfn might contain
> pagecache or anon memory (well anything allocated with GFP_MOVEABLE)
>
> - even /dev/mem now invalidates mappings when the kernel requests that
> iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
> ("/dev/mem: Revoke mappings when a driver claims the region")
>
> Accessing pfns obtained from ptes without holding all the locks is
> therefore no longer a good idea.
>
> Unfortunately there's some users where this is not fixable (like v4l
> userptr of iomem mappings) or involves a pile of work (vfio type1
> iommu). For now annotate these as unsafe and splat appropriately.
>
> This patch adds an unsafe_follow_pfn, which later patches will then
> roll out to all appropriate places.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> Cc: Jason Gunthorpe <jgg@xxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: John Hubbard <jhubbard@xxxxxxxxxx>
> Cc: Jérôme Glisse <jglisse@xxxxxxxxxx>
> Cc: Jan Kara <jack@xxxxxxx>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Cc: linux-mm@xxxxxxxxx
> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> Cc: linux-samsung-soc@xxxxxxxxxxxxxxx
> Cc: linux-media@xxxxxxxxxxxxxxx
> Cc: kvm@xxxxxxxxxxxxxxx
> ---
> include/linux/mm.h | 2 ++
> mm/memory.c | 32 +++++++++++++++++++++++++++++++-
> mm/nommu.c | 17 +++++++++++++++++
> security/Kconfig | 13 +++++++++++++
> 4 files changed, 63 insertions(+), 1 deletion(-)

Makes sense to me.

I wonder if we could change the original follow_pfn to require the
ptep and then lockdep_assert_held() it against the page table lock?

> +int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address,
> + unsigned long *pfn)
> +{
> +#ifdef CONFIG_STRICT_FOLLOW_PFN
> + pr_info("unsafe follow_pfn usage rejected, see
> CONFIG_STRICT_FOLLOW_PFN\n");

Wonder if we can print something useful here, like the current
PID/process name?

> diff --git a/security/Kconfig b/security/Kconfig
> index 7561f6f99f1d..48945402e103 100644
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -230,6 +230,19 @@ config STATIC_USERMODEHELPER_PATH
> If you wish for all usermode helper programs to be disabled,
> specify an empty string here (i.e. "").
>
> +config STRICT_FOLLOW_PFN
> + bool "Disable unsafe use of follow_pfn"
> + depends on MMU

I would probably invert this CONFIG_ALLOW_UNSAFE_FOLLOW_PFN
default n

Jason