Re: [PATCH V4] mm: Add sysfs interface to dump each node's zonelist information

From: Anshuman Khandual
Date: Sat Sep 17 2016 - 00:27:24 EST


On 09/12/2016 11:43 PM, David Rientjes wrote:
> On Mon, 12 Sep 2016, Anshuman Khandual wrote:
>
>>>>> after memory or node hot[un]plug is desirable. This change adds one
>>>>> new sysfs interface (/sys/devices/system/memory/system_zone_details)
>>>>> which will fetch and dump this information.
>>> Doesn't this violate the "one value per file" sysfs rule? Does it
>>> belong in debugfs instead?
>>
>> Yeah sure. Will make it a debugfs interface.
>>
>
> So the intended reader of this file is running as root?

Yeah.

>
>>> I also really question the need to dump kernel addresses out, filtered
>>> or not. What's the point?
>>
>> Hmm, thought it to be an additional information. But yes its additional
>> and can be dropped.
>>
>
> I'm questioning if this information can be inferred from information
> already in /proc/zoneinfo and sysfs. We know the no-fallback zonelist is
> going to include the local node, and we know the other zonelists are
> either node ordered or zone ordered (or do we need to extend
> vm.numa_zonelist_order for default?). I may have missed what new
> knowledge this interface is imparting on us.

IIUC /proc/zoneinfo lists the internal state and statistics of every
zone on the system at a given point in time. The no-fallback list
contains the zones from the local node, while the fallback list
(which gets used more often than the no-fallback one) contains all
zones in either node order or zone order. On most platforms the
default is node order, but the sequence of present nodes within that
order is determined by various factors such as NUMA distance, load,
and the presence of CPUs on the node. This order of nodes in the
fallback list is the most important information this interface
provides.
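
To illustrate why the node order is not obvious from /proc/zoneinfo
alone, here is a rough userspace sketch that orders nodes purely by
NUMA distance from a given local node. It is only an approximation
under stated assumptions: the kernel also weighs load and CPU presence,
which this sketch ignores, and the names sketch_fallback_order() and
MAX_NODES are hypothetical, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 8	/* hypothetical limit for this sketch */

/*
 * Fill order[] with node ids sorted by ascending distance from 'local'.
 * dist[a][b] is assumed to follow the usual SLIT convention (10 means
 * local). Ties are broken in favour of the lower node id. Load and CPU
 * presence, which the real ordering also considers, are ignored here.
 * Returns the number of entries written to order[].
 */
size_t sketch_fallback_order(int dist[MAX_NODES][MAX_NODES],
			     size_t nr_nodes, size_t local,
			     size_t order[MAX_NODES])
{
	int used[MAX_NODES] = { 0 };
	size_t count = 0;

	assert(nr_nodes <= MAX_NODES && local < nr_nodes);

	while (count < nr_nodes) {
		size_t best = nr_nodes;	/* sentinel: nothing picked yet */
		size_t n;

		/* Pick the closest node that has not been placed yet. */
		for (n = 0; n < nr_nodes; n++) {
			if (used[n])
				continue;
			if (best == nr_nodes ||
			    dist[local][n] < dist[local][best])
				best = n;
		}
		used[best] = 1;
		order[count++] = best;
	}
	return count;
}
```

With a three node distance table of {10, 40, 20}, {40, 10, 20} and
{20, 20, 10}, this gives 0, 2, 1 for node 0. The actual fallback list
may differ once load and CPU presence are factored in, which is the
information the proposed interface would expose directly.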