Thanks, I see the change.

On Tue 21-01-20 09:44:16, Wei Yang wrote:
> On Mon, Jan 20, 2020 at 02:17:44PM +0100, Michal Hocko wrote:
> > On Mon 20-01-20 14:06:26, Michal Hocko wrote:
> > > On Sat 18-01-20 13:26:43, Yang Shi wrote:
> > > > The do_move_pages_to_node() might return > 0 value, the number of pages
> > > > that are not migrated, then the value will be returned to userspace
> > > > directly. But, move_pages() syscall would just return 0 or errno. So,
> > > > we need reset the return value to 0 for such case as what pre-v4.17 did.
> > >
> > > The patch is wrong. migrate_pages returns the number of pages it
> > > _hasn't_ migrated or -errno. Yeah that semantic sucks but...
> > > So err != 0 is always an error. Except err > 0 doesn't really provide
> > > any useful information to the userspace. I cannot really remember what
> > > was the actual behavior before my rework because there were some gotchas
> > > hidden there.
> >
> > OK, so I've double checked. do_move_page_to_node_array would carry the
> > error code over to do_pages_move and it would store the status stored
> > in the pm array. It contains page_to_nid(page) so the resulting code
> > indeed behaves properly before my change and this is a regression. I
> > have a very vague recollection that this has been brought up already.
> > <...looks in notes...>
> > Found it! The report is
> > http://lkml.kernel.org/r/0329efa0984b9b0252ef166abb4498c0795fab36.1535113317.git.jstancek@xxxxxxxxxx
> > and my proposed workaround was http://lkml.kernel.org/r/20180829145537.GZ10223@xxxxxxxxxxxxxx
>
> Well, the above two links return 404.

You are right. They are not archived for some reason. Anyway, the patch
I was proposing back then is below:

commit cfb88c266b645197135cde2905c2bfc82f6d82a9
Author: Michal Hocko <mhocko@xxxxxxxx>
Date:   Wed Nov 14 12:19:09 2018 +0100

    mm: fix do_pages_move error reporting

    a49bd4d71637 ("mm, numa: rework do_pages_move") has changed the way how
    we report error to layers above. As the changelog mentioned the semantic
    was quite unclear previously because the return 0 could mean both
    success and failure.

    The above mentioned commit didn't get all the way down to fix this
    completely because it doesn't report pages that we even haven't
    attempted to migrate and therefore we cannot simply say that the
    semantic is:
    - err < 0 - errno
    - err >= 0 number of non-migrated pages.

    Fixes: a49bd4d71637 ("mm, numa: rework do_pages_move")
    Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>

diff --git a/mm/migrate.c b/mm/migrate.c
index f7e4bfdc13b7..aa53ebc523eb 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1615,8 +1615,16 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 			goto out_flush;
 
 		err = do_move_pages_to_node(mm, &pagelist, current_node);
-		if (err)
+		if (err) {
+			/*
+			 * Positive err means the number of failed pages to
+			 * migrate. Make sure to report the rest of the
+			 * nr_pages is not migrated as well.
+			 */
+			if (err > 0)
+				err += nr_pages - i - 1;
 			goto out;
+		}
 		if (i > start) {
 			err = store_status(status, start, current_node, i - start);
 			if (err)
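
For readers who want to check the arithmetic in the hunk above, here is a
minimal userspace sketch of the same accounting. It is only a model, not
kernel code: report_unmigrated() and its parameter names are hypothetical,
and batch_err merely stands in for the value that do_move_pages_to_node()
returns in the patch.

```c
#include <assert.h>

/*
 * Hypothetical userspace model of the accounting proposed in the patch.
 *
 * batch_err: stand-in for the do_move_pages_to_node() return value
 *            (< 0 is -errno, > 0 is the number of pages in the flushed
 *            batch that could not be migrated, 0 is success).
 * nr_pages:  total number of pages the caller asked to move.
 * i:         index of the page being processed when the batch was flushed.
 */
static long report_unmigrated(long batch_err, unsigned long nr_pages,
			      unsigned long i)
{
	/* The loop index is always within the requested range. */
	assert(i < nr_pages);

	/*
	 * A positive value only counts failures inside the flushed batch.
	 * Add the nr_pages - i - 1 pages after index i, as the patch does,
	 * so they are reported as not migrated too.
	 */
	if (batch_err > 0)
		batch_err += (long)(nr_pages - i - 1);

	/* Negative errno values and 0 pass through unchanged. */
	return batch_err;
}
```

With nr_pages = 10, a flush at i = 3 and 2 failed pages in that batch, the
sketch returns 2 + (10 - 3 - 1) = 8, while an errno-style value such as -14
comes back unchanged.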