Re: [PATCH] proc: pagemap: Hold mmap_sem during page walk

From: Linus Torvalds
Date: Thu Apr 01 2010 - 11:15:45 EST

On Thu, 1 Apr 2010, KAMEZAWA Hiroyuki wrote:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> In the initial design, walk_page_range() was meant only for walking page
> tables and did not require mmap_sem. Now find_vma() etc. are used in
> walk_page_range(), so we need mmap_sem around it.
> This patch adds mmap_sem around walk_page_range().
> Because /proc/<pid>/pagemap's callback routine uses put_user(), we have to
> get rid of it to do a sane fix.
> Changelog:
> - fixed start_vaddr calculation
> - removed unnecessary cast.
> - removed unnecessary change in smaps.
> - use GFP_TEMPORARY instead of GFP_KERNEL
> - use min().

Looks mostly correct to me (but just looking at the source, no testing,
obviously). And I like how the double buffering removes more lines of code
than it adds.

However, I think there is a subtle problem with this:

> +	while (count && (start_vaddr < end_vaddr)) {
> +		int len;
> +		unsigned long end;
> +
> +		pm.pos = 0;
> +		end = min(start_vaddr + PAGEMAP_WALK_SIZE, end_vaddr);
> +		down_read(&mm->mmap_sem);
> +		ret = walk_page_range(start_vaddr, end, &pagemap_walk);
> +		up_read(&mm->mmap_sem);
> +		start_vaddr += PAGEMAP_WALK_SIZE;

I think "start_vaddr + PAGEMAP_WALK_SIZE" might overflow, and then 'end'
ends up being odd. You'll never notice on architectures where the user
space doesn't go all the way up to the end (walk_page_range will return 0
etc), but it will do the wrong thing if 'start' is close to the end, end
is _at_ the end, and you'll not be able to read that range (because of the
overflow).

So I do think you should do something like

	end = start_vaddr + PAGEMAP_WALK_SIZE;
	/* overflow? or final chunk? */
	if (end < start_vaddr || end > end_vaddr)
		end = end_vaddr;

instead of using 'min()'.

(This only matters if TASK_SIZE_OF() can be ~0ul, but I think that can
happen on sparc, for example)
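
To make the difference concrete, here is a minimal userspace sketch (not
kernel code) comparing the min() computation with the explicit check above.
WALK_SIZE is a hypothetical stand-in for PAGEMAP_WALK_SIZE with an
illustrative value, and the helper names are made up for this example. It
relies only on the fact that unsigned arithmetic in C wraps around.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for PAGEMAP_WALK_SIZE; the value is illustrative. */
#define WALK_SIZE 0x200000UL

/* Chunk end as computed with min(): misbehaves when start + WALK_SIZE wraps. */
static unsigned long chunk_end_min(unsigned long start, unsigned long end_vaddr)
{
	unsigned long end = start + WALK_SIZE;	/* may wrap to a small value */

	return end < end_vaddr ? end : end_vaddr;
}

/* Chunk end with the explicit overflow check. */
static unsigned long chunk_end_safe(unsigned long start, unsigned long end_vaddr)
{
	unsigned long end = start + WALK_SIZE;

	/* overflow? or final chunk? */
	if (end < start || end > end_vaddr)
		end = end_vaddr;
	return end;
}

/* Self-check: a start address close to the top of the address space. */
static void check_chunk_ends(void)
{
	unsigned long top = ULONG_MAX;		/* address space ends at ~0ul */
	unsigned long start = top - 0xfffUL;	/* last 4k below the top */

	/* min() picks the wrapped sum, so 'end' lands below 'start'. */
	assert(chunk_end_min(start, top) < start);
	/* The explicit check clamps to the real end instead. */
	assert(chunk_end_safe(start, top) == top);
}
```

With the min() variant the walk would be asked for a range whose end is below
its start, so that last chunk never gets read. Away from the top of the
address space the two helpers return the same value.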
