Re: [PATCH v3 15/16] mm: replace __GFP_NO_CODETAG with ALLOC_NO_CODETAG
From: Hao Ge
Date: Tue Jun 30 2026 - 00:35:06 EST
CC Andrew and Suren
On 2026/6/29 23:56, sashiko-bot@xxxxxxxxxx wrote:
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
New issues:
- [Low] The fallback `#else` definition of `pgalloc_tag_add` was not updated to accept `unsigned int alloc_flags`.
Pre-existing issues:
- [Critical] TOCTOU race in `__alloc_tag_add_early_pfn` leading to an out-of-bounds array write.
--
commit ccee7f6d4bcf9d00da38c878e97dd852c5e649b3
Author: Brendan Jackman <jackmanb@xxxxxxxxxx>
mm: replace __GFP_NO_CODETAG with ALLOC_NO_CODETAG
This commit replaces the mm-internal __GFP_NO_CODETAG flag with a
new alloc_flags bit, ALLOC_NO_CODETAG. It plumbs alloc_flags through
several allocator functions to allow passing this flag directly to the
alloc_tag code.
diff --git a/mm/alloc_tag.c b/mm/alloc_tag.c [ ... ]
index d9be1cf5187d9..a32a94e759b94 100644
--- a/mm/alloc_tag.c
+++ b/mm/alloc_tag.c
@@ -806,7 +795,8 @@ static void __init __alloc_tag_add_early_pfn(unsigned long pfn)
[Severity: Critical]
 	do {
 		pool = READ_ONCE(current_pfn_pool);
 		if (!pool || atomic_read(&pool->count) >= PFN_POOL_SIZE) {
-			struct page *new_page = alloc_page(__GFP_HIGH | __GFP_NO_CODETAG);
+			struct page *new_page = __alloc_pages(__GFP_HIGH, 0, numa_mem_id(),
+							      NULL, ALLOC_NO_CODETAG);
 			struct pfn_pool *new;
This is a pre-existing issue, but is there a time-of-check to time-of-use
race condition here?
If multiple CPUs are active during the deferred init window, could they
concurrently observe the same remaining capacity?
CPU1:
if (!pool || atomic_read(&pool->count) >= PFN_POOL_SIZE)
CPU2:
if (!pool || atomic_read(&pool->count) >= PFN_POOL_SIZE)
If both see 1 remaining slot, won't they both proceed to call
atomic_fetch_add() on the pool counter further down in this function?
This could cause an out-of-bounds array write at pool->pfns[PFN_POOL_SIZE].
Interesting.
I didn't get this warning from Sashiko when I submitted the patch earlier,
so I'm not sure what changed on its end to trigger it now.
No worries though. Unless I've overlooked some detail,
I believe this is a false positive.
Even if both CPUs see 1 remaining slot and both skip the pool-growing branch,
they still have to go through atomic_cmpxchg() to actually claim it.
static void __init __alloc_tag_add_early_pfn(unsigned long pfn)
{
	struct pfn_pool *pool;
	int idx;

	do {
		pool = READ_ONCE(current_pfn_pool);
		if (!pool || atomic_read(&pool->count) >= PFN_POOL_SIZE) {
			.........
		}
		idx = atomic_read(&pool->count);
		if (idx >= PFN_POOL_SIZE)
			continue;
		if (atomic_cmpxchg(&pool->count, idx, idx + 1) == idx)
			break;
	} while (1);
	pool->pfns[idx] = pfn;
}
Only one wins; the other gets a CAS failure and loops back.
There's no path where both write to pfns[PFN_POOL_SIZE].
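
To make the argument concrete, here is a rough userspace sketch that replays
the interleaving from the report with C11 atomics. It is not the kernel code:
struct demo_pool, POOL_SIZE, try_claim() and replay_last_slot_race() are
made-up names for illustration, and atomic_compare_exchange_strong() stands in
for the kernel's atomic_cmpxchg() (it returns a bool rather than the old value).

```c
#include <stdatomic.h>
#include <stdbool.h>

#define POOL_SIZE 4 /* hypothetical; stands in for PFN_POOL_SIZE */

struct demo_pool {
    atomic_int count;
    unsigned long pfns[POOL_SIZE];
};

/*
 * One claim attempt based on a previously observed count.
 * Returns true only if the CAS moved count from observed to observed + 1,
 * in which case the caller owns slot 'observed' and the pfn is stored there.
 */
static bool try_claim(struct demo_pool *pool, int observed, unsigned long pfn)
{
    int expected = observed;

    if (observed >= POOL_SIZE)
        return false; /* pool looked full; caller would retry */
    if (!atomic_compare_exchange_strong(&pool->count, &expected, observed + 1))
        return false; /* someone else claimed the slot first */
    pool->pfns[observed] = pfn;
    return true;
}

/*
 * Replays the reported interleaving: two CPUs both read the count while
 * exactly one slot is left, then both try to claim it.
 * Returns how many of the two attempts succeeded.
 */
static int replay_last_slot_race(void)
{
    struct demo_pool pool = { .pfns = { 0 } };
    int winners = 0;

    atomic_init(&pool.count, POOL_SIZE - 1);

    /* Both CPUs observe the same remaining capacity. */
    int seen_by_cpu1 = atomic_load(&pool.count);
    int seen_by_cpu2 = atomic_load(&pool.count);

    if (try_claim(&pool, seen_by_cpu1, 0x1000))
        winners++;
    if (try_claim(&pool, seen_by_cpu2, 0x2000))
        winners++;

    return winners;
}
```

In this replay the second attempt fails its compare-and-swap because the count
no longer matches what it observed, so it never reaches the array store. In the
real function it would loop back, re-read the count, and hit the
idx >= PFN_POOL_SIZE check.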
Given that, I think we can ignore this Sashiko report and
move forward with Brendan's patch.
Thanks
Best Regards
Hao
[ ... ]
diff --git a/mm/page_alloc.c b/mm/page_alloc.c [ ... ]
index 026f33f217036..803b32e5a5e47 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1263,17 +1263,17 @@ void __pgalloc_tag_add(struct page *page, struct task_struct *task,
[Severity: Low]
 		alloc_tag_add_early_pfn(page_to_pfn(page), alloc_flags);
 		if (task->alloc_tag)
 			alloc_tag_set_inaccurate(task->alloc_tag);
 	}
 }
 static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
-				   unsigned int nr, gfp_t gfp_flags)
+				   unsigned int nr, unsigned int alloc_flags)
 {
 	if (mem_alloc_profiling_enabled())
-		__pgalloc_tag_add(page, task, nr, gfp_flags);
+		__pgalloc_tag_add(page, task, nr, alloc_flags);
 }
Does this change miss updating the fallback definition of pgalloc_tag_add()
when memory allocation profiling is disabled?
If the kernel is compiled without mem-alloc profiling, the dummy function
signature appears to still use the old type:
static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
unsigned int nr, gfp_t gfp_flags) {}
Since callers like post_alloc_hook() are now passing an unsigned int
for alloc_flags, won't this cause a Sparse type mismatch warning because
gfp_t is a restricted bitwise type?
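
For reference, a minimal sketch of what a matching fallback stub could look
like, assuming the profiling-disabled path only needs its signature to agree
with the enabled one. The forward declarations, DEMO_ALLOC_NO_CODETAG and
demo_call_stub() are placeholders for illustration, not the kernel's
definitions, and this is not taken from the patch.

```c
#include <stddef.h>

/* Forward declarations stand in for the kernel's real types. */
struct page;
struct task_struct;

/* Placeholder bit value, chosen only for this illustration. */
#define DEMO_ALLOC_NO_CODETAG 0x1u

/*
 * No-op stub whose last parameter is a plain unsigned int, matching what
 * callers now pass. With a gfp_t parameter here, Sparse would see a plain
 * integer converted to a __bitwise type.
 */
static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
                                   unsigned int nr, unsigned int alloc_flags)
{
    (void)page;
    (void)task;
    (void)nr;
    (void)alloc_flags;
}

/* Hypothetical caller that passes alloc_flags the way the hook would. */
static unsigned int demo_call_stub(void)
{
    unsigned int alloc_flags = DEMO_ALLOC_NO_CODETAG;

    pgalloc_tag_add(NULL, NULL, 1, alloc_flags);
    return alloc_flags;
}
```

A regular C compiler accepts either signature, since gfp_t is an unsigned int
underneath; the mismatch would only surface in a Sparse run.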