RE: [RFC PATCH 0/4]: affinity-on-next-touch

From: Lee Schermerhorn
Date: Thu Jun 18 2009 - 00:37:52 EST


On Wed, 2009-06-17 at 09:45 +0200, Stefan Lankes wrote:
> > I've placed the last rebased version in :
> >
> > http://free.linux.hp.com/~lts/Patches/PageMigration/2.6.28-rc4-mmotm-081110/
> >
>
> OK! I will try to reconstruct the problem.

Stefan:

Today I rebased the migrate on fault patches to 2.6.30-mmotm-090612...
[along with my shared policy series atop which they sit in my tree].
Patches reside in:

http://free.linux.hp.com/~lts/Patches/PageMigration/2.6.30-mmotm-090612-1220/


I did a quick test. I'm afraid the patches have suffered some "bit rot"
vis a vis mainline/mmotm over the past several months. Two possibly
related issues:

1) Lazy migration doesn't seem to work. It looks like
mbind(<some-policy>+MPOL_MF_MOVE+MPOL_MF_LAZY) is not unmapping the
pages, so, of course, migrate on fault won't work. I suspect the
reference count handling has changed since I last tried this. [Note: one
of the patch conflicts was in the MPOL_MF_LAZY addition to the mbind
flag definitions in mempolicy.h, and I may have botched the resolution
thereof.]

2) When the pages get freed on exit/unmap, they are still PageLocked(),
and free_pages_check()/bad_page() bugs out with bad page state.

Note: This is independent of memcg--i.e., it happens whether or not memcg
is configured.

To test this, I created a test cpuset with all nodes/mems/cpus and
enabled migrate_on_fault therein. I then ran an interactive "memtoy"
session there [shown below]. Memtoy is a program I use for ad hoc
testing of various mm features. You can find the latest version [almost
always] at:

http://free.linux.hp.com/~lts/Tools/memtoy-latest.tar.gz

You'll need the numactl-devel package to build--an older one with the V1
API, I think. I need to update memtoy for the latest libnuma.

The same directory [Tools] contains a tarball of simple cpuset scripts
to make, query, modify, "enter" and run commands in cpusets. There may
be other versions of such scripts around. If you don't already have
any, feel free to grab them.

Since you've expressed interest in this [as has Kamezawa-san], I'll try
to pay some attention to debugging the patches in my copious spare time.
And, I'd be very interested in anything you discover in your
investigations.

Regards,
Lee

Memtoy-0.19c [for latest MPOL_MF flags defs]:

!!! lines are my annotations:

memtoy pid: 4222
memtoy>mems
mems allowed = 0-3
mems policy = 0-3
memtoy>cpus
cpu affinity mask/ids: 0-7
memtoy>anon a1 8p
memtoy>map a1
memtoy>mbind a1 pref 1
memtoy>touch a1 w
memtoy: touched 8 pages in 0.000 secs
memtoy>where a1
a 0x00007f51ae757000 0x000000008000 0x000000000000 rw- default a1
page offset +00 +01 +02 +03 +04 +05 +06 +07
0: 1 1 1 1 1 1 1 1
memtoy>mbind a1 pref+move 2
memtoy: migration of a1 [8 pages] took 0.000secs.

memtoy>where a1
a 0x00007f51ae757000 0x000000008000 0x000000000000 rw- default a1
page offset +00 +01 +02 +03 +04 +05 +06 +07
0: 2 2 2 2 2 2 2 2

!!! direct migration [still] works! Try lazy:

memtoy>mbind a1 pref+move+lazy 3
memtoy: unmap of a1 [8 pages] took 0.000secs.
memtoy>where a1

!!! "where" command uses get_mempolicy() w/ MPOL_F_ADDR|MPOL_F_NODE flags
to fetch page location. Will call get_user_pages() and refault pages.
Should migrate to node 3, but:

a 0x00007f51ae757000 0x000000008000 0x000000000000 rw- default a1
page offset +00 +01 +02 +03 +04 +05 +06 +07
0: 2 2 2 2 2 2 2 2
!!! didn't move
memtoy>exit


On console I see, for each of 8 pages of segment a1:

BUG: Bad page state in process memtoy pfn:67515f
page:ffffea001699ccc8 flags:0a0000000010001d count:0 mapcount:0
mapping:(null) index:7f51ae75e
Pid: 4222, comm: memtoy Not tainted 2.6.30-mmotm-090612-1220+spol+lpm #6
Call Trace:
[<ffffffff810a787a>] bad_page+0xaa/0x130
[<ffffffff810a8719>] free_hot_cold_page+0x199/0x1d0
[<ffffffff810a8774>] __pagevec_free+0x24/0x30
[<ffffffff810ac96a>] release_pages+0x1ca/0x210
[<ffffffff810c8b7d>] free_pages_and_swap_cache+0x8d/0xb0
[<ffffffff810c0505>] exit_mmap+0x145/0x160
[<ffffffff81044177>] mmput+0x47/0xa0
[<ffffffff81048854>] exit_mm+0xf4/0x130
[<ffffffff81049c58>] do_exit+0x188/0x810
[<ffffffff81337194>] ? do_page_fault+0x184/0x310
[<ffffffff8104a31e>] do_group_exit+0x3e/0xa0
[<ffffffff8104a392>] sys_exit_group+0x12/0x20
[<ffffffff8100bd2b>] system_call_fastpath+0x16/0x1b


Page flags 0x10001d: locked, referenced, uptodate, dirty, swapbacked.
'locked' is bad state.
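
For anyone who wants to repeat that decoding on the other seven reports,
here is a minimal Python sketch. The bit positions in the table are an
assumption inferred from the decoding above (0x10001d -> locked,
referenced, uptodate, dirty, swapbacked); they are not taken from the
kernel headers, and the real layout depends on kernel version and config.
The helper names and the 32-bit cutoff for the low flag bits are likewise
hypothetical.

```python
# Bit positions are an ASSUMPTION inferred from the decoding quoted in the
# message (0x10001d -> locked, referenced, uptodate, dirty, swapbacked).
# They are not read from kernel headers and vary with version and config.
ASSUMED_FLAG_BITS = {
    0: "locked",
    2: "referenced",
    3: "uptodate",
    4: "dirty",
    20: "swapbacked",
}

# Per the message, a page that is still locked when it is freed is bad.
BAD_AT_FREE = {"locked"}

# Assumption: only the low 32 bits are treated as per-page flags here;
# the upper bits of the word are ignored by this sketch.
LOW_FLAG_BITS = 32


def decode_page_flags(flags):
    """Return names of the set bits in the low part of a page flags word."""
    names = []
    for bit in range(LOW_FLAG_BITS):
        if flags & (1 << bit):
            # Bits without an assumed name are reported by position.
            names.append(ASSUMED_FLAG_BITS.get(bit, "bit%d" % bit))
    return names


def bad_flags_at_free(flags):
    """Return the decoded flags that should not be set on a freed page."""
    return [name for name in decode_page_flags(flags) if name in BAD_AT_FREE]


if __name__ == "__main__":
    raw = 0x0a0000000010001d  # "flags:" value from the console output
    print(decode_page_flags(raw))
    print(bad_flags_at_free(raw))
```

Feeding it the flags value from the trace should list the same five
flags named above and single out 'locked' as the offender.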

