[PATCH 2/3] mmap: fix the usage of ->vm_pgoff in special_mapping paths

From: Oleg Nesterov
Date: Sun Jun 21 2015 - 17:09:27 EST


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <assert.h>

void *find_vdso_vaddr(void)
FILE *perl;
char buf[32] = {};

perl = popen("perl -e 'open STDIN,qq|/proc/@{[getppid]}/maps|;"
"/^(.*?)-.*vdso/ && print hex $1 while <>'", "r");
fread(buf, sizeof(buf), 1, perl);

return (void *)atol(buf);

#define PAGE_SIZE 4096

int main(void)
void *vdso = find_vdso_vaddr();

// of course they should differ, and they do so far
printf("vdso pages differ: %d\n",
!!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));

// split into 2 vma's
assert(mprotect(vdso, PAGE_SIZE, PROT_READ) == 0);

// force another fault on the next check
assert(madvise(vdso, 2 * PAGE_SIZE, MADV_DONTNEED) == 0);

// now they no longer differ, the 2nd vm_pgoff is wrong
printf("vdso pages differ: %d\n",
!!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));

return 0;


vdso pages differ: 1
vdso pages differ: 0

This is because split_vma() correctly updates ->vm_pgoff, but the logic
in insert_vm_struct() and special_mapping_fault() is absolutely broken,
so the fault at vdso + PAGE_SIZE return the 1st page. The same happens
if you simply unmap the 1st page.

special_mapping_fault() does:

pgoff = vmf->pgoff - vma->vm_pgoff;

and this is _only_ correct if vma->vm_start mmaps the first page from
->vm_private_data array.

vdso or any other user of install_special_mapping() is not anonymous,
it has the "backing storage" even if it is just the array of pages.
So we actually need to make vm_pgoff work as an offset in this array.

Note: this also allows to fix another problem: currently gdb can't access
"[vvar]" memory because in this case special_mapping_fault() doesn't work.
Now that we can use ->vm_pgoff we can implement ->access() and fix this.

Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
mm/mmap.c | 12 ++----------
1 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index bb50cac..992417f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2871,7 +2871,7 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
* using the existing file pgoff checks and manipulations.
* Similarly in do_mmap_pgoff and in do_brk.
- if (!vma->vm_file) {
+ if (vma_is_anonymous(vma)) {
vma->vm_pgoff = vma->vm_start >> PAGE_SHIFT;
@@ -3013,21 +3013,13 @@ static int special_mapping_fault(struct vm_area_struct *vma,
pgoff_t pgoff;
struct page **pages;

- /*
- * special mappings have no vm_file, and in that case, the mm
- * uses vm_pgoff internally. So we have to subtract it from here.
- * We are allowed to do this because we are the mm; do not copy
- * this code into drivers!
- */
- pgoff = vmf->pgoff - vma->vm_pgoff;
if (vma->vm_ops == &legacy_special_mapping_vmops)
pages = vma->vm_private_data;
pages = ((struct vm_special_mapping *)vma->vm_private_data)->

- for (; pgoff && *pages; ++pages)
+ for (pgoff = vmf->pgoff; pgoff && *pages; ++pages)

if (*pages) {

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
Please read the FAQ at http://www.tux.org/lkml/