Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone

From: Wei Yang
Date: Wed Feb 08 2017 - 09:07:08 EST


On Tue, Feb 07, 2017 at 04:41:21PM +0100, Michal Hocko wrote:
>On Tue 07-02-17 23:32:47, Wei Yang wrote:
>> On Tue, Feb 07, 2017 at 10:45:57AM +0100, Michal Hocko wrote:
>[...]
>> >Is there any reason why for_each_mem_pfn_range cannot be changed to
>> >honor the given start/end pfns instead? I can imagine that a small zone
>> >would see a similar pointless iterations...
>> >
>>
>> Hmm... No special reason, just not thought about this implementation. And
>> actually I just do the similar thing as in zone_spanned_pages_in_node(), in
>> which also return 0 when there is no overlap.
>>
>> BTW, I don't get your point. You wish to put the check in
>> for_each_mem_pfn_range() definition?
>
>My point was that you are handling one special case (an empty zone) but
>the underlying problem is that __absent_pages_in_range might be wasting
>cycles iterating over memblocks that are way outside of the given pfn
>range. At least this is my understanding. If you fix that you do not
>need the special case, right?

Yep, I think this is a good suggestion. By doing do, this could save iterating
cycles in __absent_pages_in_range().

Hmm, the case is a little bit different in zone_absent_pages_in_node() in case
there is movable zone in this node. Even __absent_pages_in_range() returns 0,
it is not a proof that this node has no page in this zone. Which means, we
still need to go through the ZONE_MOVABLE handling part, which is a memblock
iteration too.

Let's take a look whether guard __absent_pages_in_range() internally is
necessary now.

The function itself is invoked at three places:

* numa_meminfo_cover_memory()
* zone_absent_pages_in_node()
* absent_pages_in_range()

The first one is invoked on numa_meminfo which is sanitized by
numa_cleanup_meminfo().
The second one is analysed here.

The third one is invoked at two places:
* numa_meminfo_cover_memory()
* mem_hole_size()

At the first place, it is passed with (0, max_pfn) as parameter, which I think
is not common to have max_pfn to be 0.
At the second place, the start_pfn and end_pfn is already guarded.

With all those status, currently I choose to put the check in
zone_absent_pages_in_node().

BTW, the ZONE_MOVABLE handling looks strange to me and the comment "Treat
pages to be ZONE_MOVABLE in ZONE_NORMAL as absent pages and vice versa" is
hard to understand. From the code point of view, if zone_type is ZONE_NORMAL,
each memblock region between zone_start_pfn and zone_end_pfn would be treated
as absent pages if it is not mirrored. Do you have some hint on this?

>--
>Michal Hocko
>SUSE Labs

--
Wei Yang
Help you, Help me

Attachment: signature.asc
Description: PGP signature