Re: [PATCH v4] /dev/mem: Revoke mappings when a driver claims the region
From: Kees Cook
Date: Thu Apr 07 2022 - 23:35:10 EST
On Thu, Apr 07, 2022 at 04:43:10PM -0700, Dan Williams wrote:
> On Thu, Apr 7, 2022 at 11:47 AM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >
> > On Wed, Apr 6, 2022 at 12:46 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> > >
> > > *thread necromancy*
> >
> > It's alive!
> >
> > >
> > > Hi Dan,
> > >
> > > I'm doing a KSPP bug scrub and am reviewing
> > > https://github.com/KSPP/linux/issues/74 again.
> > >
> > > Do you have a chance to look at this? I'd love a way to make mmap()
> > > behave the same way as read() for the first meg of /dev/mem.
> >
> > You want 0-reads or SIGBUS when attempting to access the first 1MB?
> >
> > Because it sounds like what you want is instead of loudly failing with
> > -EPERM in drivers/char/mem.c::mmap_mem() you want it to silently
> > succeed but swap in the zero page, right? Otherwise if it's SIGBUS
> > then IO_STRICT_DEVMEM=y + marking that span as IORESOURCE_BUSY will
> > "Do the Right Thing (TM)".
>
> In other words, if IO_STRICT_DEVMEM is enabled then the enforcement is
> already there at least for anything marked IORESOURCE_BUSY. So if
> tools are ok with that protection today, maybe there is no need to do
> the zero page dance. I.e. legacy tools that read(2) /dev/mem below 1MB
> get zeroes, and apparently no tools were mmap'ing below 1MB otherwise
> they would have complained by now? At least Fedora is shipping
> IO_STRICT_DEVMEM these days:
>
> https://src.fedoraproject.org/rpms/kernel/blob/rawhide/f/kernel-x86_64-fedora.config#_2799
When I try to mmap a RAM area <1MiB, mmap succeeds (range_is_allowed()
is non-zero), so I don't think IO_STRICT_DEVMEM would trip anything
using mmap on /dev/mem there.

I am only reading 0s from there, though, and I don't see where that is
happening. I thought maybe the memory was just literally unused, but even
with CONFIG_PAGE_POISONING=y and booting with page_poison=1, I still read
0s (not 0xaa). I'd like to understand _why_ (i.e. I can't tell if it is
accidentally safe, intentionally safe, or my test is bad).

For example:
# cat /proc/iomem
00000000-00000fff : Reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : Reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c99ff : Video ROM
...
If I mmap page 0, it's rejected (non-RAM). If I mmap page 1, it works,
but it's all 0s. (Which is what I'd like, but I don't see where this is
happening.)
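
A minimal sketch of the kind of test described above, for anyone who wants
to reproduce it. This is not the exact program used here; the helper names
probe_page() and all_bytes_equal() are made up for illustration. It maps one
page read-only and reports whether it is all 0s, all 0xaa (the poison value
mentioned above), or mixed. On the machine above one would call
probe_page("/dev/mem", 0x1000) as root, assuming 4KiB pages so that offset
0x1000 is page 1.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if every byte in buf[0..len) equals val, else 0. */
int all_bytes_equal(const unsigned char *buf, size_t len, unsigned char val)
{
	assert(buf != NULL);
	for (size_t i = 0; i < len; i++)
		if (buf[i] != val)
			return 0;
	return 1;
}

/*
 * Hypothetical helper: map one page of `path` at byte offset `off`
 * (must be page-aligned) and classify its contents.
 *
 * Returns:  1 if the page is all 0x00
 *           2 if the page is all 0xaa
 *           0 if the contents are mixed
 *          -1 if open() or mmap() failed (errno is preserved)
 */
int probe_page(const char *path, off_t off)
{
	long psz = sysconf(_SC_PAGESIZE);
	if (psz <= 0)
		return -1;

	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	void *p = mmap(NULL, (size_t)psz, PROT_READ, MAP_SHARED, fd, off);
	int saved = errno;
	close(fd);		/* the mapping stays valid after close */
	if (p == MAP_FAILED) {
		errno = saved;
		return -1;
	}

	int result = 0;
	if (all_bytes_equal(p, (size_t)psz, 0x00))
		result = 1;
	else if (all_bytes_equal(p, (size_t)psz, 0xaa))
		result = 2;

	munmap(p, (size_t)psz);
	return result;
}
```

A -1 with errno set to EPERM for offset 0 and a return of 1 for offset
0x1000 would match the behaviour reported above. It would not say where the
zeroes come from.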
Hmmm.
--
Kees Cook