Re: [PATCH v2] /dev/mem: Revoke mappings when a driver claims the region
From: Dan Williams
Date: Tue May 19 2020 - 14:27:17 EST
On Tue, May 19, 2020 at 5:11 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, May 19, 2020 at 12:03:06AM -0700, Dan Williams wrote:
> > Close the hole where a /dev/mem mapping of a given address range can be
> > held across a kernel driver's takeover of that range.
> >
> > Commit 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
> > introduced CONFIG_IO_STRICT_DEVMEM with the goal of protecting the
> > kernel against scenarios where a /dev/mem user tramples memory that a
> > kernel driver owns. However, this protection only prevents *new* read(),
> > write() and mmap() requests. Established mappings prior to the driver
> > calling request_mem_region() are left alone.
> >
> > Especially with persistent memory, and the core kernel metadata that is
> > stored there, there are plentiful scenarios for a /dev/mem user to
> > violate the expectations of the driver and cause amplified damage.
> >
> > Teach request_mem_region() to find and shoot down active /dev/mem
> > mappings of a range that it believes it has successfully claimed for
> > the exclusive use of the driver. Effectively a driver call to
> > request_mem_region() becomes a hole-punch on the /dev/mem device.
> >
> > The typical usage of unmap_mapping_range() is part of
> > truncate_pagecache() to punch a hole in a file, but in this case the
> > implementation is only doing the "first half" of a hole punch. Namely it
> > is just evacuating current established mappings of the "hole", and it
> > relies on the fact that /dev/mem establishes mappings in terms of
> > absolute physical address offsets. Once existing mmap users are
> > invalidated they can attempt to re-establish the mapping, or attempt to
> > continue issuing read(2) / write(2) to the invalidated extent, but they
> > will then be subject to the CONFIG_IO_STRICT_DEVMEM checking that can
> > block those subsequent accesses.
> >
> > Cc: Arnd Bergmann <arnd@xxxxxxxx>
> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> > Cc: Russell King <linux@xxxxxxxxxxxxxxxx>
> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > Fixes: 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
> > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > ---
> > Changes since v1 [1]:
> >
> > - Updated the changelog to describe the usage of unmap_mapping_range().
> >   No other logic changes.
> >
> > [1]: http://lore.kernel.org/r/158662721802.1893045.12301414116114602646.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > Greg, Andrew,
> >
> > I have a regression test for this case now. This was found by an
> > intermittent data corruption scenario on pmem from a test tool using
> > /dev/mem.
>
> Ick, why are test tools messing around in /dev/mem :)

Yeah, I'm all for useful tools, just not at the expense of kernel integrity.

> Anyway, this seems sane to me, want me to take it through my tree?

Yes please, seems to belong with the driver core.

Thanks!