Re: [PATCH v2 2/4] mm: handle uninitialized numa nodes gracefully
From: Michal Hocko
Date: Fri Jan 14 2022 - 05:01:53 EST
On Fri 14-01-22 00:24:15, Wei Yang wrote:
> On Tue, Dec 14, 2021 at 11:38:47AM +0100, Michal Hocko wrote:
> >On Tue 14-12-21 11:33:41, Christoph Lameter wrote:
> >> On Tue, 14 Dec 2021, Michal Hocko wrote:
> >>
> >> > This patch takes a different approach (following a lead of [3]) and it
> >> > pre-allocates pgdat for all possible nodes in arch independent code
> >> > - free_area_init. All uninitialized nodes are treated as memoryless
> >> > nodes. node_state of the node is not changed because that would lead to
> >> > other side effects - e.g. sysfs representation of such a node and from
> >> > past discussions [4] it is known that some tools might have problems
> >> > digesting that.
> >>
> >> Would it be possible to define a pgdat statically and place it in read
> >> only memory? Populate with values that ensure that the page allocator
> >> does not blow up but does a defined fallback.
> >>
> >> Point the pgdat for all nodes not online to that readonly pgdat?
> >>
> >> Maybe that would save some memory. When the node comes online then a real
> >> pgdat could be allocated.
> >
> >This is certainly possible but also it is more complex. I aim for as
> >simple as possible at this stage. The reason I am not concerned about
> >memory overhead so much (even though the pgdat is a large data
> >structure) is that these unpopulated nodes are rather rare. We might see
> >more of them in the future but we are not quite there yet so I do not
> >think this is a major obstacle for now.
>
> Another thing is we still have a chance to get NULL NODE_DATA if we failed to
> allocate it. And this is the problem we want to address here.
A system that is so short on memory this early in boot that this
allocation fails is very unlikely to finish booting. I do not think
we can do any reasonable allocation failure handling here.
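
To illustrate the idea, here is a minimal userspace sketch of
pre-allocating a pgdat for every possible node and treating the
uninitialized ones as memoryless. It is a toy model, not the patch:
struct toy_pgdat, toy_node_data, toy_init_possible_nodes() and MAX_NODES
are hypothetical names, calloc() stands in for the early boot allocator,
and aborting on allocation failure is only one assumed way to express
"no reasonable failure handling".

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_NODES 8	/* hypothetical stand-in for the possible-node limit */

/* Toy stand-in for pg_data_t; the real structure is far larger. */
struct toy_pgdat {
	int node_id;
	unsigned long present_pages;	/* 0 means the node is memoryless */
};

/* Toy stand-in for the per-node NODE_DATA() lookup table. */
static struct toy_pgdat *toy_node_data[MAX_NODES];

/*
 * Give every possible node a pgdat. Nodes that already have one are left
 * alone; the rest get a zeroed pgdat and therefore look memoryless.
 * Returns the number of pgdats allocated by this call.
 */
static int toy_init_possible_nodes(const bool possible[MAX_NODES])
{
	int allocated = 0;

	for (int nid = 0; nid < MAX_NODES; nid++) {
		struct toy_pgdat *pgdat;

		if (!possible[nid] || toy_node_data[nid])
			continue;

		pgdat = calloc(1, sizeof(*pgdat));
		if (!pgdat) {
			/* Assumption: failing this early is treated as fatal. */
			fprintf(stderr, "cannot allocate pgdat for node %d\n", nid);
			abort();
		}

		pgdat->node_id = nid;
		toy_node_data[nid] = pgdat;
		allocated++;
	}

	return allocated;
}
```

With this in place, a lookup for any possible node returns a valid
(if empty) structure, so callers do not need a NULL check.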
--
Michal Hocko
SUSE Labs