On 4/8/25 16:25, David Hildenbrand wrote:
On 08.04.25 16:18, Harry Yoo wrote:
On Tue, Apr 08, 2025 at 12:17:52PM +0200, David Hildenbrand wrote:
On 08.04.25 10:41, Oscar Salvador wrote:
Currently, slab_mem_going_going_callback() checks whether the node has
N_NORMAL memory in order to be set in slab_nodes.
While it is true that gettind rid of that enforcing would mean
ending up with movables nodes in slab_nodes, the memory waste that comes
with that is negligible.
So stop checking for status_change_nid_normal and just use status_change_nid
instead which works for both types of memory.
Also, once we allocate the kmem_cache_node cache for the node in
slab_mem_online_callback(), we never deallocate it in
slab_mem_off_callback() when the node goes memoryless, so we can just
get rid of it.
The only side effect is that we will stop clearing the node from slab_nodes.
Feel free to add a Suggested-by: if you think it applies.
Do we have to take care of the N_NORMAL_MEMORY check in kmem_cache_init() ? Likely it
would have to be a N_MEMORY check.
But, I was wondering if we could get rid of the "slab_nodes" thingy as a first step?
The following commit says that SLUB has slab_nodes thingy for a reason...
kmem_cache_node might not be ready yet even when N_NORMAL_MEMORY check
says it now has normal memory.
node_states_set_node() is called from memory hotplug code after
MEM_GOING_ONLINE and after online_pages_range().
Pages might be isolated at that point, but node_states_set_node() is set
only after the memory notifier (MEM_GOING_ONLINE) was triggered.
So I don't immediately see the problem assuming that we never free the
structures.
Hm, I think it still exists. So consider:
- slab_mem_going_online_callback() creates kmem_cache_node for the new node
for existing caches under the mutex
Then a cache creation races with "node_states_set_node() is set only after
the memory notifier (MEM_GOING_ONLINE) was triggered" in a way that it
doesn't see the node set yet - kmem_cache_node for the new node on the new
cache will be missing.
The root issue is node_states_set_node() doesn't happen under slab_mutex.
slab_nodes update happens under the slab_mutex to avoid this race.