On 6/20/19 6:08 PM, Yang Shi wrote:
Ah, I see. IIUC the current behavior is due to your recent commit
On 6/20/19 12:18 AM, Vlastimil Babka wrote:
On 6/19/19 8:19 PM, Yang Shi wrote:The current behavior is:
I think it does. There's:Yesïexactly.This is getting even more muddy TBH. Is there any reason that weI think it was already said that if pages can't be isolated, then
have to
handle this problem during the isolation phase rather the migration?
migration phase won't process them, so they're just ignored.
However I think the patch is wrong to abort immediately whenIt is fine too. I don't see mbind semantics define how to handle such
encountering such page that cannot be isolated (AFAICS). IMHO it should
still try to migrate everything it can, and only then return -EIO.
case other than returning -EIO.
If MPOL_MF_MOVE is specified in flags, then the kernel *will attempt to
move all the existing pages* ... If MPOL_MF_STRICT is also specified,
then the call fails with the error *EIO if some pages could not be moved*
Aborting immediately would be against the attempt to move all.
By looking into the code, it looks not that easy as what I thought.I think we just need to remember if there was at least one page that
do_mbind() would check the return value of queue_pages_range(), it just
applies the policy and manipulates vmas as long as the return value is 0
(success), then migrate pages on the list. We could put the movable
pages on the list by not breaking immediately, but they will be ignored.
If we migrate the pages regardless of the return value, it may break the
policy since the policy will *not* be applied at all.
failed isolation or migration, but keep working, and in the end return
EIO if there was such page(s). I don't think it breaks the policy. Once
pages are allocated in a mapping, changing the policy is a best effort
thing anyway.
If queue_pages_range() return -EIO (vma is not migratable, ignore other
conditions since we just focus on page migration), the policy won't be
set and no page will be migrated.
a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when
MPOL_MF_STRICT is specified") in order to fix commit 6f4576e3687b
("mempolicy: apply page table walker on queue_pages_range()"), which
caused -EIO to be not returned enough. But I think you went too far and
instead return -EIO too much. If I look at the code in parent commit of
6f4576e3687b, I can see in queue_pages_range():
if ((flags & MPOL_MF_STRICT) ||
((flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) &&
vma_migratable(vma))) {
err = queue_pages_pgd_range(vma, start, endvma, nodes,
flags, private);
if (err)
break;
}
and in queue_pages_pte_range():
if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))
migrate_page_add(page, private, flags);
else
break;
So originally, there was no returning of -EIO due to !vma_migratable() -
as long as MPOL_MF_STRICT and MPOL_MF_MOVE* was specified, the code
tried to queue for migration everything it could and didn't ever abort,
AFAICS. And I still think that's the best possible behavior.
However, the problem here is the vma might look migratable, but some orSee above how the current behavior differs from the original one.
all the underlying pages are unmovable. So, my patch assumes the vma is
*not* migratable if at least one page is unmovable. I'm not sure if it
is possible to have both movable and unmovable pages for the same
mapping or not, I'm supposed the vma would be split much earlier.
If we don't abort immediately, then we record if there is unmovable
page, then we could do:
#1. Still follows the current behavior (then why not abort immediately?)
#2. Set mempolicy then migrate all the migratable pages. But, we may endThe policy can already be "broken" (violated is probably better word) by
up with the pages on node A, but the policy says node B. Doesn't it
break the policy?
migrate_pages() failing. If that happens, we don't rollback the migrated
pages and reset the policy back, right? I think the manpage is clear
that MPOL_MF_MOVE is a best-effort. Userspace will know that not
everything was successfully migrated (via -EIO), and can take whatever
steps it deems necessary - attempt rollback, determine which exact
page(s) are violating the policy, etc.