Re: [PATCH 2/2] mm: Add sysfs interface to dump each node's zonelist information

From: Anshuman Khandual
Date: Fri Sep 02 2016 - 00:34:46 EST


On 09/01/2016 02:42 AM, Andrew Morton wrote:
> On Wed, 31 Aug 2016 08:55:50 +0530 Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx> wrote:
>
>> Each individual node in the system has a ZONELIST_FALLBACK zonelist
>> and a ZONELIST_NOFALLBACK zonelist. These zonelists decide fallback
>> order of zones during memory allocations. Sometimes it helps to dump
>> these zonelists to see the priority order of various zones in them.
>> This change just adds a sysfs interface for doing the same.
>>
>> Example zonelist information from a KVM guest.
>>
>> [NODE (0)]
>> ZONELIST_FALLBACK
>> (0) (node 0) (zone DMA c00000000140c000)
>> (1) (node 1) (zone DMA c000000100000000)
>> (2) (node 2) (zone DMA c000000200000000)
>> (3) (node 3) (zone DMA c000000300000000)
>> ZONELIST_NOFALLBACK
>> (0) (node 0) (zone DMA c00000000140c000)
>> [NODE (1)]
>> ZONELIST_FALLBACK
>> (0) (node 1) (zone DMA c000000100000000)
>> (1) (node 2) (zone DMA c000000200000000)
>> (2) (node 3) (zone DMA c000000300000000)
>> (3) (node 0) (zone DMA c00000000140c000)
>> ZONELIST_NOFALLBACK
>> (0) (node 1) (zone DMA c000000100000000)
>> [NODE (2)]
>> ZONELIST_FALLBACK
>> (0) (node 2) (zone DMA c000000200000000)
>> (1) (node 3) (zone DMA c000000300000000)
>> (2) (node 0) (zone DMA c00000000140c000)
>> (3) (node 1) (zone DMA c000000100000000)
>> ZONELIST_NOFALLBACK
>> (0) (node 2) (zone DMA c000000200000000)
>> [NODE (3)]
>> ZONELIST_FALLBACK
>> (0) (node 3) (zone DMA c000000300000000)
>> (1) (node 0) (zone DMA c00000000140c000)
>> (2) (node 1) (zone DMA c000000100000000)
>> (3) (node 2) (zone DMA c000000200000000)
>> ZONELIST_NOFALLBACK
>> (0) (node 3) (zone DMA c000000300000000)
>
> Can you please sell this a bit better? Why does it "sometimes help"?
> Why does the benefit of this patch to our users justify the overhead
> and cost?

On platforms that support hotplugging memory into zones which did not
exist at boot, this interface helps in visualizing which zonelists of
the system the newly hot-added memory ends up in. POWER is one such
platform: all the memory detected at boot time stays in ZONE_DMA for
good, but the hotplug process can then bring new memory into
ZONE_MOVABLE. So having a way to get a snapshot of the zonelists on the
system after memory or node hot[un]plug is a good thing, IMHO.
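
To illustrate how such a snapshot could be consumed from userspace, here
is a rough Python sketch that parses a dump in the format shown in the
example above and reports which zonelists contain a given zone. It is
only a sketch under assumptions: parse_zonelists() and
zonelists_containing() are hypothetical helper names, the text format is
taken from the example output in this thread and may change, and the
zone name "Movable" used in the lookup is an assumed spelling. The dump
is passed in as a string because the sysfs path is not yet documented.

```python
import re
from collections import defaultdict

# Line formats assumed from the example dump quoted in this thread.
NODE_RE = re.compile(r"\[NODE \((\d+)\)\]")
ENTRY_RE = re.compile(r"\((\d+)\) \(node (\d+)\) \(zone (\S+) ([0-9a-f]+)\)")


def parse_zonelists(text):
    """Return {(owner_node, list_name): [(node, zone_name), ...]}.

    Entries are kept in the order they appear, i.e. priority order.
    """
    result = defaultdict(list)
    owner = None
    list_name = None
    for raw in text.splitlines():
        line = raw.strip()
        match = NODE_RE.fullmatch(line)
        if match:
            owner = int(match.group(1))
            list_name = None
            continue
        if line.startswith("ZONELIST_"):
            list_name = line
            continue
        match = ENTRY_RE.fullmatch(line)
        if match and owner is not None and list_name is not None:
            result[(owner, list_name)].append(
                (int(match.group(2)), match.group(3))
            )
    return dict(result)


def zonelists_containing(zonelists, zone_name):
    """Return the sorted (owner_node, list_name) keys that include zone_name."""
    return sorted(
        key
        for key, entries in zonelists.items()
        if any(zone == zone_name for _node, zone in entries)
    )


# Node 0 block copied from the example dump above.
SAMPLE = """\
[NODE (0)]
ZONELIST_FALLBACK
(0) (node 0) (zone DMA c00000000140c000)
(1) (node 1) (zone DMA c000000100000000)
(2) (node 2) (zone DMA c000000200000000)
(3) (node 3) (zone DMA c000000300000000)
ZONELIST_NOFALLBACK
(0) (node 0) (zone DMA c00000000140c000)
"""

if __name__ == "__main__":
    snapshot = parse_zonelists(SAMPLE)
    for key, entries in sorted(snapshot.items()):
        print(key, entries)
    # "Movable" is an assumed zone name; adjust to whatever the dump prints.
    print(zonelists_containing(snapshot, "Movable"))
```

Running the same lookup on snapshots taken before and after a hotplug
operation would show which nodes' zonelists picked up the new zone.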

>
> Please document the full path to the sysfs file(s) within the changelog.

Sure, will do.

>
> Please find somewhere in Documentation/ to document the new interface.
>

Sure, will create the following file describing the interface.

Documentation/ABI/testing/sysfs-system-zone-details