Re: [PATCH v4 3/3] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages

From: Vlastimil Babka
Date: Fri Dec 01 2017 - 10:20:16 EST


On 12/01/2017 10:55 AM, Yisheng Xie wrote:
> According to the migrate_pages manpage, errno should be set to EINVAL when
> none of the node IDs specified by new_nodes are on-line and allowed by the
> process's current cpuset context, or none of the specified nodes contain
> memory. However, when tested with the following case:
>
> new_nodes = 0;
> old_nodes = 0xf;
> ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
>
> ret is 0 and errno is not set. Since new_nodes is empty, we
> should expect EINVAL as documented.
>
> To fix cases like the one above, this patch checks whether the target nodes
> ANDed with the current task's task_nodes are empty, and then whether the
> result ANDed with node_states[N_MEMORY] is empty.
>
> Meanwhile, this patch also removes the EPERM check on CAP_SYS_NICE.
> The caller of migrate_pages should be able to migrate the target process's
> pages anywhere the caller can allocate memory, if the caller can access
> the mm_struct.
>
> Signed-off-by: Yisheng Xie <xieyisheng1@xxxxxxxxxx>
> Cc: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> Cc: Chris Salls <salls@xxxxxxxxxxx>
> Cc: Christopher Lameter <cl@xxxxxxxxx>
> Cc: David Rientjes <rientjes@xxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
> Cc: Tan Xiaojun <tanxiaojun@xxxxxxxxxx>
> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> ---
> v3:
> * check whether the node mask is empty after ANDing it with the current
> task's nodes, and then with the nodes which have memory
> v4:
> * remove the check of EPERM on CAP_SYS_NICE.
>
> Hi Vlastimil and Christopher,
>
> Could you please help to review this version?

Hi, I think we should stay with v3 after all. What I missed when
reviewing it is that the EPERM check is against cpuset_mems_allowed(task),
while in v3 you add the EINVAL check against cpuset_mems_allowed(current),
and the two may not be the same. The intention of CAP_SYS_NICE is not
whether we can bypass our own cpuset, but whether we can bypass the target
task's cpuset. Sorry for the confusion.
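
To make the distinction concrete, here is a rough user-space sketch of the
check order I have in mind. It is only a model under stated assumptions, not
the kernel code: nodemask, check_new_nodes() and its parameters are
hypothetical names, plain unsigned long bitmasks stand in for nodemask_t, and
a flag stands in for capable(CAP_SYS_NICE).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the kernel's nodemask_t: bit n set = node n. */
typedef unsigned long nodemask;

/*
 * Model of the check order discussed in this thread (an assumption drawn
 * from the discussion, not copied from mm/mempolicy.c):
 *   1. EPERM:  new_nodes must be a subset of the TARGET task's allowed
 *              nodes, unless the caller has CAP_SYS_NICE.
 *   2. EINVAL: new_nodes ANDed with the CALLER's allowed nodes must not
 *              be empty.
 *   3. EINVAL: that result ANDed with the nodes that have memory must not
 *              be empty.
 * Returns 0 and narrows *new_nodes on success, or a negative errno value.
 */
static int check_new_nodes(nodemask *new_nodes,
                           nodemask target_allowed,
                           nodemask caller_allowed,
                           nodemask nodes_with_memory,
                           int caller_has_sys_nice)
{
    /* Permission: may the caller step outside the target's cpuset? */
    if ((*new_nodes & ~target_allowed) && !caller_has_sys_nice)
        return -EPERM;

    /* Validity: keep only nodes the caller itself may allocate from. */
    *new_nodes &= caller_allowed;
    if (*new_nodes == 0)
        return -EINVAL;

    /* Validity: keep only nodes that actually have memory. */
    *new_nodes &= nodes_with_memory;
    if (*new_nodes == 0)
        return -EINVAL;

    return 0;
}

/* Usage: the empty new_nodes case from the report should give -EINVAL. */
static void example_empty_mask(void)
{
    nodemask new_nodes = 0;

    assert(check_new_nodes(&new_nodes, 0xf, 0xf, 0xf, 0) == -EINVAL);
}
```

In this model the EPERM test looks only at the target's mask and the EINVAL
tests look only at the caller's mask and the memory mask, so neither check can
stand in for the other.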

> Thanks
> Yisheng Xie
>
> mm/mempolicy.c | 13 +++++--------
> 1 file changed, 5 insertions(+), 8 deletions(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 65df28d..4da74b6 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1426,17 +1426,14 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
>  	}
>  	rcu_read_unlock();
>
> -	task_nodes = cpuset_mems_allowed(task);
> -	/* Is the user allowed to access the target nodes? */
> -	if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
> -		err = -EPERM;
> +	task_nodes = cpuset_mems_allowed(current);
> +	nodes_and(*new, *new, task_nodes);
> +	if (nodes_empty(*new))
>  		goto out_put;
> -	}
>
> -	if (!nodes_subset(*new, node_states[N_MEMORY])) {
> -		err = -EINVAL;
> +	nodes_and(*new, *new, node_states[N_MEMORY]);
> +	if (nodes_empty(*new))
>  		goto out_put;
> -	}
>
>  	err = security_task_movememory(task);
>  	if (err)
>