Re: [PATCH v6 1/2] sched/numa: introduce per-cgroup NUMA locality info

From: Michal Koutný
Date: Fri Jan 03 2020 - 10:15:00 EST


Hi.

On Fri, Dec 13, 2019 at 09:47:36AM +0800, 王贇 <yun.wang@xxxxxxxxxxxxxxxxx> wrote:
> By monitoring the increments, we will be able to locate the per-cgroup
> workload which NUMA Balancing can't help with (usually caused by wrong
> CPU and memory node bindings), then we got chance to fix that in time.
I just wonder: do the data based on increments match those you
obtained previously?
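
For reference, a rough userspace sketch of what I understand by
"monitoring the increments". The struct and function names are
hypothetical and not taken from the patch; it only assumes two
monotonically growing counters (local and remote page accesses) that
are sampled twice.

```c
#include <stdint.h>

/* Hypothetical snapshot of the two per-cgroup counters; not the patch's layout. */
struct locality_sample {
	uint64_t local;		/* value of the "local" counter at sample time */
	uint64_t remote;	/* value of the "remote" counter at sample time */
};

/*
 * Percentage of local accesses between two samples, or -1 when no
 * access was recorded in the interval.
 */
static int locality_percent(const struct locality_sample *prev,
			    const struct locality_sample *cur)
{
	uint64_t local = cur->local - prev->local;
	uint64_t remote = cur->remote - prev->remote;
	uint64_t total = local + remote;

	if (total == 0)
		return -1;
	return (int)(local * 100 / total);
}
```

A group whose percentage stays low across intervals would be the kind
of workload the changelog describes.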

> +static inline void
> +update_task_locality(struct task_struct *p, int pnid, int cnid, int pages)
> +{
> +	if (!static_branch_unlikely(&sched_numa_locality))
> +		return;
> +
> +	/*
> +	 * pnid != cnid --> remote idx 0
> +	 * pnid == cnid --> local idx 1
> +	 */
> +	p->numa_page_access[!!(pnid == cnid)] += pages;
If the per-task information isn't used anywhere, why not accumulate
directly into the task's cfs_rq->{local,remote}_page_access?
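
Something along these lines is what I have in mind, as a minimal
userspace sketch only. The type and helper names are made up for
illustration (this is not struct cfs_rq and not code from the patch),
and locking or per-CPU concerns are ignored. Note that `==` in C
already evaluates to 0 or 1, so the `!!` is not needed for the index.

```c
#include <stdint.h>

enum { ACCESS_REMOTE = 0, ACCESS_LOCAL = 1 };

/* Hypothetical stand-in for the group-level counters; not struct cfs_rq. */
struct group_locality {
	uint64_t page_access[2];	/* indexed by ACCESS_REMOTE / ACCESS_LOCAL */
};

/*
 * Account pages straight into the group counters, skipping any
 * per-task array. pnid and cnid are used as in the quoted hunk:
 * equal ids count as local, different ids as remote.
 */
static void account_group_pages(struct group_locality *g,
				int pnid, int cnid, int pages)
{
	g->page_access[pnid == cnid] += (uint64_t)pages;
}
```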

> @@ -4298,6 +4359,7 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
>  	 */
>  	update_load_avg(cfs_rq, curr, UPDATE_TG);
>  	update_cfs_group(curr);
> +	update_group_locality(cfs_rq);
With the per-NUMA node time tracked separately, isn't it unnecessary
to do group updates inside entity_tick()?


Regards,
Michal