Re: [PATCH 1/2] sched/fair: Record the average duration of a task

From: Raghavendra K T
Date: Wed Jul 03 2024 - 09:40:30 EST




On 7/3/2024 5:27 PM, Mike Galbraith wrote:
On Wed, 2024-07-03 at 14:04 +0530, Raghavendra K T wrote:


On 7/1/2024 8:27 PM, Chen Yu wrote:

A thought occurred to me that one possible method to determine if the waker
and wakee share data could be to leverage NUMA balancing's numa_group data structure.
As NUMA balancing periodically scans the task's VMA space and groups tasks accessing
the same physical page into one numa_group, we can infer that if the waker and wakee
are within the same numa_group, they are likely to share data, and it might be
appropriate to place the wakee on top of the waker.

CC Raghavendra here in case he has any insights.


Agree with your thought here,

So I imagine two possible things to explore here.

1) Use the numa_group of task1 and task2 and check if they belong to the
same numa_group; also check if there is a possibility of an M:N
relationship by checking if t1/t2->numa_group->nr_tasks > 1, etc.

2) Given a VMA, we can use vma_numab_state pids_active[] to check if task1
and task2 (threads) are possibly interested in the same VMA.
The latter looks to be practically difficult because we probably don't
want to sweep across VMAs.
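
Option 1 above could be sketched roughly as follows. This is only a
user-space illustration, not kernel code: struct mock_numa_group and
struct mock_task are hypothetical stand-ins for the kernel's numa_group
and task_struct, and the helper names are made up for this example. The
real numa_group pointer is RCU-protected, so an in-kernel version would
need the appropriate locking, which is ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct numa_group. */
struct mock_numa_group {
	int nr_tasks;	/* tasks that NUMA balancing placed in this group */
};

/* Hypothetical stand-in for task_struct; only the field used here. */
struct mock_task {
	struct mock_numa_group *numa_group;	/* NULL if the task is ungrouped */
};

/* True if both tasks are grouped and point at the same numa_group. */
static inline bool same_numa_group(const struct mock_task *waker,
				   const struct mock_task *wakee)
{
	if (!waker || !wakee)
		return false;
	return waker->numa_group && waker->numa_group == wakee->numa_group;
}

/* The nr_tasks > 1 check from the thread: the group has more than one member. */
static inline bool group_has_multiple_tasks(const struct mock_task *t)
{
	return t && t->numa_group && t->numa_group->nr_tasks > 1;
}

/* Small self-check showing how the two helpers would be combined. */
void example_usage(void)
{
	struct mock_numa_group shared = { .nr_tasks = 2 };
	struct mock_task waker = { .numa_group = &shared };
	struct mock_task wakee = { .numa_group = &shared };
	struct mock_task loner = { .numa_group = NULL };

	assert(same_numa_group(&waker, &wakee));
	assert(group_has_multiple_tasks(&waker));
	assert(!same_numa_group(&waker, &loner));
	assert(!group_has_multiple_tasks(&loner));
}
```

Whether a positive result should actually lead to stacking the wakee on
the waker is a separate policy question that this sketch does not answer.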

Oooh dear.. as soon as you mention threads, the question of whose
wheelhouse this is in springs to mind, ie should the kernel be
overriding userspace by targeting bits of threaded programs for forced
serialization?


Yes.. There is no ROI on this option (mentioned only for completeness).
Also, we are not looking beyond a single process. Rather than "practically
difficult", I should have phrased it as "practically not an option".

Bah, think I'll just bugger off and let you guys have a go at making
this stacking business do less harm than good.

-Mike