Re: [PATCH 1/2] mm,migration: During fork(), wait for migration to end if migration PTE is encountered

From: Minchan Kim
Date: Mon Apr 26 2010 - 19:29:00 EST


On Tue, Apr 27, 2010 at 7:37 AM, Mel Gorman <mel@xxxxxxxxx> wrote:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
>
> At page migration, we replace the pte with a migration_entry, which has
> a format similar to a swap_entry, and replace it with the real pfn at the
> end of migration. But there is a race with fork()'s copy_page_range().
>
> Assume page migration on CPU A and fork on CPU B. On CPU A, a page of
> a process is under migration. On CPU B, that page's pte is being copied.
>
>         CPU A                   CPU B
>                                 do_fork()
>                                 copy_mm() (from process1 to process2)
>                                 insert new vma to mmap_list (if inode/anon_vma)
>         pte_lock(process1)
>         unmap a page
>         insert migration_entry
>         pte_unlock(process1)
>
>         migrate page copy
>                                 copy_page_range
>         remap new page by rmap_walk()
>         pte_lock(process2)
>         found no pte.
>         pte_unlock(process2)
>                                 pte_lock(process2)
>                                 pte_lock(process1)
>                                 copy migration entry to process2
>                                 pte_unlock(process1)
>                                 pte_unlock(process2)
>         pte_lock(process1)
>         replace migration entry
>         with new page's pte.
>         pte_unlock(process1)
>
> So some serialization is necessary. IIUC, this is a very rare event, but
> it is reproducible if a lot of migration is happening while the
> following program runs in parallel.
>
>   #include <stdio.h>
>   #include <string.h>
>   #include <stdlib.h>
>   #include <unistd.h>
>   #include <sys/types.h>
>   #include <sys/wait.h>
>   #include <sys/mman.h>
>
>   #define SIZE (24*1048576UL)
>   #define CHILDREN 100
>   int main(void)
>   {
>           int i = 0;
>           pid_t pids[CHILDREN];
>           /* Anonymous mappings take fd -1 for portability */
>           char *buf = mmap(NULL, SIZE, PROT_READ|PROT_WRITE,
>                           MAP_PRIVATE|MAP_ANONYMOUS,
>                           -1, 0);
>           if (buf == MAP_FAILED) {
>                   perror("mmap");
>                   exit(-1);
>           }
>
>           /* Mark every slot empty so waitpid() never sees garbage */
>           for (i = 0; i < CHILDREN; i++)
>                   pids[i] = -1;
>           i = 0;
>
>           while (++i) {
>                   int j = i % CHILDREN;
>
>                   /* Every CHILDREN iterations, reap the whole batch */
>                   if (j == 0) {
>                           printf("Waiting on children\n");
>                           for (j = 0; j < CHILDREN; j++) {
>                                   memset(buf, i, SIZE);
>                                   if (pids[j] != -1)
>                                           waitpid(pids[j], NULL, 0);
>                           }
>                           j = 0;
>                   }
>
>                   /* Child dirties the whole buffer, then exits */
>                   if ((pids[j] = fork()) == 0) {
>                           memset(buf, i, SIZE);
>                           exit(EXIT_SUCCESS);
>                   }
>           }
>
>           munmap(buf, SIZE);
>           return 0;
>   }
>
> To fix this, copy_page_range() can wait for the end of migration when it
> encounters a migration entry.
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> Signed-off-by: Mel Gorman <mel@xxxxxxxxx>
Reviewed-by: Minchan Kim <minchan.kim@xxxxxxxxx>

--
Kind regards,
Minchan Kim