Re: [PATCH] mm/sparse: Fix flags overlap in section_mem_map

From: David Hildenbrand
Date: Tue Apr 27 2021 - 05:05:25 EST


On 27.04.21 10:30, Wang Wensheng wrote:
The section_mem_map member of struct mem_section stores some flags and
the address of the struct page array of the mem_section.

Additionally, the node id of the mem_section is stored during early boot,
when the struct page array has not yet been allocated. In other words, the
higher bits of section_mem_map are used for two purposes, and the node id
should be cleared properly after early boot.

Currently the node id field overlaps with the flag field and cannot be
cleared properly. The overlapping bits are then treated as mem_section
flags and may lead to unexpected side effects.

Define SECTION_NID_SHIFT using order_base_2 to ensure that the node id
field is always located after the flags field. The overlap occurred
because SECTION_NID_SHIFT was not increased when a new mem_section flag
was added.

Fixes: 326e1b8f83a4 ("mm/sparsemem: introduce a SECTION_IS_EARLY flag")
Signed-off-by: Wang Wensheng <wangwensheng4@xxxxxxxxxx>
---
include/linux/mmzone.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 47946ce..b01694d 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1325,7 +1325,7 @@ extern size_t mem_section_usage_size(void);
#define SECTION_TAINT_ZONE_DEVICE (1UL<<4)
#define SECTION_MAP_LAST_BIT (1UL<<5)
#define SECTION_MAP_MASK (~(SECTION_MAP_LAST_BIT-1))
-#define SECTION_NID_SHIFT 3
+#define SECTION_NID_SHIFT order_base_2(SECTION_MAP_LAST_BIT)
static inline struct page *__section_mem_map_addr(struct mem_section *section)
{


Well, all sections around during boot that have an early NID are early ... so it's not an issue with SECTION_IS_EARLY, no? I mean, it's ugly, but not broken.

But it's an issue with SECTION_TAINT_ZONE_DEVICE, AFAICT. sparse_init_one_section() would leave the bit set if the nid happens to have that bit set (e.g., node 2 or 3). It's semi-broken then, because we force all pfn_to_online_page() calls through the slow path.
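
To make the overlap concrete, here is a minimal standalone sketch, not kernel code. The last three flag definitions are copied from the hunk above; SECTION_IS_EARLY at bit 3 is what I recall from mmzone.h at the time. The two helpers are hypothetical names that only mimic the early "nid << SECTION_NID_SHIFT" encoding and the "&= ~SECTION_MAP_MASK" step in sparse_init_one_section(), which keeps the low flag bits and drops the rest.

```c
/* Flag layout as in the quoted mmzone.h hunk. */
#define SECTION_IS_EARLY		(1UL << 3)	/* recalled, not in the hunk */
#define SECTION_TAINT_ZONE_DEVICE	(1UL << 4)
#define SECTION_MAP_LAST_BIT		(1UL << 5)
#define SECTION_MAP_MASK		(~(SECTION_MAP_LAST_BIT - 1))

/*
 * Hypothetical helper: the value stored in section_mem_map during early
 * boot for a given node id and a given SECTION_NID_SHIFT.
 */
static unsigned long encode_early_nid(unsigned long nid, unsigned int shift)
{
	return nid << shift;
}

/*
 * Hypothetical helper: the bits of the early nid encoding that survive
 * "section_mem_map &= ~SECTION_MAP_MASK" and are therefore later
 * interpreted as section flags.
 */
static unsigned long stale_flag_bits(unsigned long nid, unsigned int shift)
{
	return encode_early_nid(nid, shift) & ~SECTION_MAP_MASK;
}
```

With the old shift of 3, stale_flag_bits(2, 3) equals SECTION_TAINT_ZONE_DEVICE and stale_flag_bits(3, 3) equals SECTION_IS_EARLY | SECTION_TAINT_ZONE_DEVICE. With a shift of 5, which is what order_base_2(SECTION_MAP_LAST_BIT) evaluates to here, both return 0.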


That whole section flag setting code is fragile.

--
Thanks,

David / dhildenb