Re: [PATCH -mm] autonuma: Fix scan period updating
From: Huang\, Ying
Date: Wed Jul 03 2019 - 20:32:12 EST
Mel Gorman <mgorman@xxxxxxx> writes:
> On Tue, Jun 25, 2019 at 09:23:22PM +0800, huang ying wrote:
>> On Mon, Jun 24, 2019 at 10:25 PM Mel Gorman <mgorman@xxxxxxx> wrote:
>> > On Mon, Jun 24, 2019 at 10:56:04AM +0800, Huang Ying wrote:
>> > > The autonuma scan period should be increased (scanning is slowed down)
>> > > if the majority of the page accesses are shared with other processes.
>> > > But in current code, the scan period will be decreased (scanning is
>> > > speeded up) in that situation.
>> > >
>> > > This patch fixes the code. And this has been tested via tracing the
>> > > scan period changing and /proc/vmstat numa_pte_updates counter when
>> > > running a multi-threaded memory accessing program (most memory
>> > > areas are accessed by multiple threads).
>> > >
>> > The patch somewhat flips the logic on whether shared or private is
>> > considered and it's not immediately obvious why that was required. That
>> > aside, other than the impact on numa_pte_updates, what actual
>> > performance difference was measured and on on what workloads?
>> The original scanning period updating logic doesn't match the original
>> patch description and comments. I think the original patch
>> description and comments make more sense. So I fix the code logic to
>> make it match the original patch description and comments.
>> If my understanding to the original code logic and the original patch
>> description and comments were correct, do you think the original patch
>> description and comments are wrong so we need to fix the comments
>> instead? Or you think we should prove whether the original patch
>> description and comments are correct?
> I'm about to get knocked offline so cannot answer properly. The code may
> indeed be wrong and I have observed higher than expected NUMA scanning
> behaviour than expected although not enough to cause problems. A comment
> fix is fine but if you're changing the scanning behaviour, it should be
> backed up with data justifying that the change both reduces the observed
> scanning and that it has no adverse performance implications.
Got it! Thanks for comments! As for performance testing, do you have
some candidate workloads?