Re: [PATCH V2] mm/vmstat: Add events for THP migration without split

From: Zi Yan
Date: Thu Jun 04 2020 - 09:51:22 EST


On 4 Jun 2020, at 7:34, Matthew Wilcox wrote:

> On Thu, Jun 04, 2020 at 09:30:45AM +0530, Anshuman Khandual wrote:
>> Add the following new VM events which will help in validating THP migration
>> without split. Statistics reported through these new events will help in
>> performance debugging.
>>
>> 1. THP_MIGRATION_SUCCESS
>> 2. THP_MIGRATION_FAILURE
>>
>> THP_MIGRATION_FAILURE in particular represents an event when a THP could
>> not be migrated as a single entity following an allocation failure and
>> ended up getting split into constituent normal pages before being retried.
>> This event, along with PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in
>> quantifying and analyzing THP migration events including both success and
>> failure cases.
>
>> +Quantifying Migration
>> +=====================
>> +Following events can be used to quantify page migration.
>> +
>> +- PGMIGRATE_SUCCESS
>> +- PGMIGRATE_FAIL
>> +- THP_MIGRATION_SUCCESS
>> +- THP_MIGRATION_FAILURE
>> +
>> +THP_MIGRATION_FAILURE in particular represents an event when a THP could not be
>> +migrated as a single entity following an allocation failure and ended up getting
>> +split into constituent normal pages before being retried. This event, along with
>> +PGMIGRATE_SUCCESS and PGMIGRATE_FAIL will help in quantifying and analyzing THP
>> +migration events including both success and failure cases.
>
> First, I'd suggest running this paragraph through 'fmt'. That way you
> don't have to care about line lengths.
>
> Second, this paragraph doesn't really explain what I need to know to
> understand the meaning of these numbers. When Linux attempts to migrate
> a THP, one of three things can happen:
>
> - It is migrated as a single THP
> - It is migrated, but had to be split
> - Migration fails
>
> How do I turn these four numbers into an understanding of how often each
> of those three situations happen? And why do we need four numbers to
> report three situations?
>
> Or is there something else that can happen? If so, I'd like that explained
> here too ;-)

PGMIGRATE_SUCCESS and PGMIGRATE_FAIL record a combination of different events,
so it is not easy to interpret them. Let me try to explain them.

1. migrating only base pages: PGMIGRATE_SUCCESS and PGMIGRATE_FAIL simply count
the base pages that are migrated and that fail to migrate, respectively.
THP_MIGRATION_SUCCESS and THP_MIGRATION_FAILURE should be 0 in this case.
Simple.

2. migrating only THPs:
- PGMIGRATE_SUCCESS means THPs that are migrated and base pages
(from the split of THPs) that are migrated.

- PGMIGRATE_FAIL means THPs that fail to migrate and base pages that fail to migrate.

- THP_MIGRATION_SUCCESS means THPs that are migrated.

- THP_MIGRATION_FAILURE means THPs that are split.

So PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS is the number of migrated base pages,
all of which come from the split of THPs.

When it comes to analyzing failed migrations, PGMIGRATE_FAIL - THP_MIGRATION_FAILURE
is the number of pages that failed to migrate, but we cannot tell how many
are base pages and how many are THPs.
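
To make the arithmetic for this THP-only case concrete, here is a minimal
sketch. It only restates the subtractions above; the function name, the
dictionary keys, and the sample numbers are made up for illustration and do
not come from the patch.

```python
def thp_only_breakdown(pgmigrate_success, pgmigrate_fail,
                       thp_migration_success, thp_migration_failure):
    """Interpret the four counters for a workload that migrates only THPs.

    Follows the accounting described in this mail; the key names below are
    hypothetical labels, not kernel identifiers.
    """
    return {
        # THPs migrated as a single entity.
        "thps_migrated_whole": thp_migration_success,
        # THPs that could not be migrated whole and were split.
        "thps_split": thp_migration_failure,
        # Base pages (all from split THPs) that were migrated.
        "base_pages_migrated_after_split":
            pgmigrate_success - thp_migration_success,
        # Failed pages; the THP vs. base page mix cannot be recovered.
        "failed_pages_unattributed":
            pgmigrate_fail - thp_migration_failure,
    }


# Hypothetical counter values, chosen only to show the subtraction.
result = thp_only_breakdown(pgmigrate_success=520, pgmigrate_fail=12,
                            thp_migration_success=8, thp_migration_failure=2)
print(result)
```

With these made-up inputs, 8 THPs moved whole, 2 were split, 512 base pages
from splits were migrated, and 10 failures remain that cannot be attributed
to either page size.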

3. migrating base pages and THPs:

The math should be very similar to the second case, except that
a) from PGMIGRATE_SUCCESS - THP_MIGRATION_SUCCESS, we cannot tell how many pages began
as base pages and how many began as THPs but became base pages after a split;
b) in PGMIGRATE_FAIL - THP_MIGRATION_FAILURE, an additional case,
pages that began as base pages and failed to migrate, is mixed into the number, and we
cannot tell the three cases apart.
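
Since the counters in /proc/vmstat are cumulative, in practice one would
compare two snapshots taken around the workload. A rough sketch of that
follows. pgmigrate_success and pgmigrate_fail are existing /proc/vmstat
names; thp_migration_success and thp_migration_failure are my assumed
lowercase spellings of the events proposed here, and the snapshot contents
are invented sample data.

```python
# First two names exist in /proc/vmstat; the last two are ASSUMED spellings
# of the events proposed in this patch.
KEYS = ("pgmigrate_success", "pgmigrate_fail",
        "thp_migration_success", "thp_migration_failure")


def parse_vmstat(text):
    """Parse 'name value' lines; counters that are absent read as 0."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return {key: stats.get(key, 0) for key in KEYS}


def migration_delta(before, after):
    """Per-interval counts plus the two differences discussed above."""
    delta = {key: after[key] - before[key] for key in KEYS}
    # Mixed workload: both numbers blend pages that began as base pages
    # with pages that came from split THPs.
    delta["base_pages_migrated"] = (
        delta["pgmigrate_success"] - delta["thp_migration_success"])
    delta["failed_unattributed"] = (
        delta["pgmigrate_fail"] - delta["thp_migration_failure"])
    return delta


# Invented snapshots in /proc/vmstat's "name value" layout.
BEFORE = """nr_free_pages 12345
pgmigrate_success 1000
pgmigrate_fail 40
thp_migration_success 10
thp_migration_failure 3
"""
AFTER = """nr_free_pages 11000
pgmigrate_success 1600
pgmigrate_fail 55
thp_migration_success 14
thp_migration_failure 5
"""

delta = migration_delta(parse_vmstat(BEFORE), parse_vmstat(AFTER))
print(delta)
```

On a real system the two strings would come from reading /proc/vmstat before
and after the migration. The derived numbers stay ambiguous in the mixed case
for the reasons in a) and b).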


--
Best Regards,
Yan Zi
