Re: [PATCH 02/11] unpaged: private write VM_RESERVED

From: David S. Miller
Date: Thu Nov 17 2005 - 18:37:17 EST


From: Hugh Dickins <hugh@xxxxxxxxxxx>
Date: Thu, 17 Nov 2005 19:30:04 +0000 (GMT)

> The PageReserved removal in 2.6.15-rc1 issued a "deprecated" message
> when you tried to mmap or mprotect MAP_PRIVATE PROT_WRITE a VM_RESERVED,
> and failed with -EACCES: because do_wp_page lacks the refinement to COW
> pages in those areas, nor do we expect to find anonymous pages in them;
> and it seemed just bloat to add code for handling such a peculiar case.
> But immediately it caused vbetool and ddcprobe (using lrmi) to fail.
>
> So revert the "deprecated" messages, letting mmap and mprotect succeed.
> But leave do_wp_page's BUG_ON(vma->vm_flags & VM_RESERVED) in place
> until we've added the code to do it right: so this particular patch is
> only good if the app doesn't really need to write to that private area.
>
> Dave Jones has changed vbetool & ddcprobe to use MAP_SHARED or PROT_READ
> just as well, but we don't want to force people to update their tools.
>
> Signed-off-by: Hugh Dickins <hugh@xxxxxxxxxxx>

lrmi makes two mmaps of interest on "/dev/mem":

1) the lowest page, which includes the interrupt vectors and
   the BIOS data area

   The BIOS code it executes does need to write to that BIOS
   data area, so PROT_WRITE is necessary, but more on this below.

2) the ROM image from 0xa0000 -> 0x100000, no PROT_WRITE necessary
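
To make the two mappings concrete, here is a rough userspace sketch.
It is not lrmi's actual code: the struct, the names low_page,
rom_image and map_region, the 4 KiB page size, and the choice of
MAP_SHARED are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/* One region of physical memory to map from /dev/mem (hypothetical type). */
struct mem_region {
    off_t  offset;  /* physical start address */
    size_t length;  /* number of bytes to map */
    int    prot;    /* mmap protection bits */
};

/* Region 1: the lowest page (interrupt vectors and BIOS data area).
 * The BIOS code writes here, so PROT_WRITE is needed.
 * Assumes a 4 KiB page, as on x86. */
const struct mem_region low_page = {
    0x00000, 0x1000, PROT_READ | PROT_WRITE
};

/* Region 2: the ROM image from 0xa0000 up to 0x100000, no PROT_WRITE. */
const struct mem_region rom_image = {
    0xa0000, 0x100000 - 0xa0000, PROT_READ
};

/* Map one region from an already-open /dev/mem descriptor.
 * MAP_SHARED is an assumption of this sketch: the mapping refers to
 * real physical memory, not a private copy.
 * Returns MAP_FAILED on error, exactly as mmap() does. */
void *map_region(int fd, const struct mem_region *r)
{
    assert(r->length > 0);
    return mmap(NULL, r->length, r->prot, MAP_SHARED, fd, r->offset);
}
```

A caller would open "/dev/mem" with O_RDWR (which normally needs
root) and call map_region(fd, &low_page) and
map_region(fd, &rom_image), checking each result against MAP_FAILED.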

But the thing about #1, which needs the PROT_WRITE, is that it does
not want anonymous pages for that stuff: it's mapping the real
physical memory at those addresses. Indeed, /dev/mem is going to
set up all the damn page tables already, regardless of whether
MAP_SHARED or MAP_PRIVATE is set. Logically, MAP_SHARED is the
thing which should be specified here, because this mapping is
not "private" to the application in any sense.
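
For anyone unsure what the MAP_SHARED versus MAP_PRIVATE distinction
means for a write, the sketch below shows it on an ordinary temporary
file, used here only as a stand-in because /dev/mem needs root. The
function name write_via_mapping and the temp-file path are
hypothetical. It writes one byte through a mapping and then reads
that byte back from the file itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `value` through a one-page mapping of a fresh, zero-filled
 * temporary file created with the given mmap flags (MAP_SHARED or
 * MAP_PRIVATE), then read byte 0 back from the file with pread().
 * Returns the byte found in the file, or -1 on any error. */
int write_via_mapping(int map_flags, unsigned char value)
{
    char path[] = "/tmp/mapdemoXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* file vanishes once fd is closed */

    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    if (ftruncate(fd, page) != 0) {     /* extend with zero bytes */
        close(fd);
        return -1;
    }

    unsigned char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                            map_flags, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = value;                       /* the write under test */

    unsigned char in_file = 0;
    ssize_t n = pread(fd, &in_file, 1, 0);

    munmap(p, (size_t)page);
    close(fd);
    return n == 1 ? in_file : -1;
}
```

On Linux, a MAP_SHARED write shows up in the underlying object,
while a MAP_PRIVATE write lands in a copied page and the object keeps
its old contents. For the BIOS data area that copy is exactly what
the application does not want.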

That is, /dev/mem mmap()s do the remap_pfn_range() for the whole area
being mmap()'d (which is where the VM_RESERVED comes from), and
therefore no COW page faults should ever occur for such areas. Faults
can occur for protection violations or memory errors, but that's it.

I would even argue that MAP_PRIVATE on things like /dev/mem should
at least be flagged with a kernel log message, if not rejected
outright with -EINVAL.