Re: [PATCH RESEND] sched/numa: expose per-task pages-migration-failure

From: Ingo Molnar
Date: Tue Dec 03 2019 - 02:16:21 EST



* çè <yun.wang@xxxxxxxxxxxxxxxxx> wrote:

> NUMA balancing tries to migrate pages between nodes, which can be
> triggered by memory policy or NUMA group aggregation, but page
> migration can also fail, e.g. when the target node runs out of
> memory.
>
> Since this is critical to performance, the admin should know how
> serious the problem is and take action before it causes too much
> performance damage, so this patch exposes the counter as
> 'migfailed' in '/proc/PID/sched'.
>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Michal Koutný <mkoutny@xxxxxxxx>
> Acked-by: Mel Gorman <mgorman@xxxxxxx>
> Signed-off-by: Michael Wang <yun.wang@xxxxxxxxxxxxxxxxx>
> ---
> kernel/sched/debug.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> index f7e4579e746c..73c4809c8f37 100644
> --- a/kernel/sched/debug.c
> +++ b/kernel/sched/debug.c
> @@ -848,6 +848,7 @@ static void sched_show_numa(struct task_struct *p, struct seq_file *m)
>  	P(total_numa_faults);
>  	SEQ_printf(m, "current_node=%d, numa_group_id=%d\n",
>  			task_node(p), task_numa_group_id(p));
> +	SEQ_printf(m, "migfailed=%lu\n", p->numa_faults_locality[2]);

Any reason not to expose the other 2 fields of this array as well, which
show remote/local migrations?
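
For illustration only, here is a rough user-space sketch of what printing
all three slots could look like. It assumes index 0 counts remote faults,
index 1 local faults and index 2 failed migrations. The helper
format_numa_locality(), the enum names and the 'mig_remote'/'mig_local'
keys are hypothetical and not part of the patch; only 'migfailed' comes
from it.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical user-space stand-in for the three-slot counter array.
 * The index meanings are an assumption made for this sketch.
 */
enum { MIG_REMOTE = 0, MIG_LOCAL = 1, MIG_FAILED = 2, MIG_NR = 3 };

/*
 * Format all three counters, one "key=value" line each.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the buffer is too small or formatting fails.
 */
static int format_numa_locality(char *buf, size_t len,
				const unsigned long counts[MIG_NR])
{
	int n = snprintf(buf, len,
			 "mig_remote=%lu\nmig_local=%lu\nmigfailed=%lu\n",
			 counts[MIG_REMOTE], counts[MIG_LOCAL],
			 counts[MIG_FAILED]);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

In the kernel the same three values would presumably go out through
SEQ_printf() next to the existing 'migfailed' line, so the snprintf() call
above only stands in for that.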

Thanks,

Ingo