Re: mm: Regression with v7.0-rc1 on RISC-V

From: Ron Economos

Date: Tue Feb 24 2026 - 15:56:28 EST


On 2/24/26 09:29, Zi Yan wrote:
On 24 Feb 2026, at 12:17, Zi Yan wrote:

On 24 Feb 2026, at 12:14, Zi Yan wrote:

On 24 Feb 2026, at 12:07, David Hildenbrand wrote:

David Hildenbrand (Arm) <david@xxxxxxxxxx> wrote on 24.02.2026 12:00 CET:





On 2/24/26 09:37, Ron Economos wrote:

I'm getting a BUG dump during shutdown with Linux v7.0-rc1 on RISC-V.



[ OK ] Reached target shutdown.target - System Shutdown.

[ OK ] Reached target final.target - Late Shutdown Services.

[ OK ] Finished systemd-reboot.service - System Reboot.

[ OK ] Reached target reboot.target - System Reboot.

[ 173.985249] BUG: Bad page state in process shutdown pfn:f8850

[ 173.985311] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xf8850

[ 173.985336] flags: 0xffff80000000000(node=0|zone=0|lastcpupid=0x1ffff) CMA

[ 173.985365] raw: 0ffff80000000000 ffffffc501e21448 ffffffc600f2ae88 0000000000000000

[ 173.985386] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000

[ 173.985403] page dumped because: nonzero _refcount

So, we're freeing something from CMA in cma_release().



In cma_release() we iterate over all pages to decrement their refcount:



VM_WARN_ON(!put_page_testzero(pfn_to_page(pfn)));



I would expect this to fire already if a page is still referenced.



Are you running with CONFIG_DEBUG_VM=y ?





--

Cheers,



David
Thinking again without my computer at hand … isn’t the call completely optimized out without CONFIG_DEBUG_VM?



At least that’s what I remember.
Right. Without CONFIG_DEBUG_VM=y, VM_WARN_ON(!put_page_testzero(pfn_to_page(pfn)))
and is_check_pages_enabled(), which leads to free_page_is_bad()’s
“page dumped because: nonzero _refcount”, are disabled.

It seems to me that someone else bumps the page refcount between
VM_WARN_ON(!put_page_testzero(pfn_to_page(pfn))) and free_page_is_bad().

Merging Ron’s reply from another thread[1]:

“Something strange is going on. I enabled CONFIG_DEBUG_VM by itself and
the issue went away. Let me try CONFIG_DEBUG_PAGE_REF.”

Looks like something is racy, since it is reproducible reliably.

[1] https://lore.kernel.org/all/30dd1efc-9bd9-4664-999e-610d181600f9@xxxxxxxx/
VM_WARN_ON() is BUILD_BUG_ON_INVALID() when CONFIG_DEBUG_VM is off. Only
the validity of the expression is checked and no code is generated.
As a result, put_page_testzero() is never executed and the refcount is
never dropped.
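
To illustrate why the decrement disappears, here is a minimal userspace
sketch. It only imitates the sizeof()-based shape of
BUILD_BUG_ON_INVALID(); the FAKE_* macros, struct fake_page and
fake_put_testzero() are hypothetical stand-ins and not kernel code.

```c
/* Userspace imitation: sizeof() type-checks its operand but never evaluates it. */
#define FAKE_BUILD_BUG_ON_INVALID(e) ((void)sizeof((long)(e)))

/* Stand-in for VM_WARN_ON() when CONFIG_DEBUG_VM is disabled. */
#define FAKE_VM_WARN_ON(cond) FAKE_BUILD_BUG_ON_INVALID(cond)

/* Hypothetical page with nothing but a reference count. */
struct fake_page {
	int refcount;
};

/* Drops one reference; returns nonzero when the count reaches zero. */
static int fake_put_testzero(struct fake_page *p)
{
	return --p->refcount == 0;
}

/* Buggy pattern: the side effect lives inside the macro argument. */
static int refcount_after_buggy_release(void)
{
	struct fake_page page = { .refcount = 1 };

	FAKE_VM_WARN_ON(!fake_put_testzero(&page));

	/* The call above sat inside sizeof(), so it was never made. */
	return page.refcount;
}
```

In this sketch refcount_after_buggy_release() returns 1, which matches the
"refcount:1" seen in the page dump.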

Hi Ron,

Can you check if the patch below fixes the issue without CONFIG_DEBUG_VM?

diff --git a/mm/cma.c b/mm/cma.c
index 94b5da468a7d..96be62eb3713 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -1020,8 +1020,11 @@ bool cma_release(struct cma *cma, const struct page *pages,
 		return false;

 	pfn = page_to_pfn(pages);
-	for (i = 0; i < count; i++, pfn++)
-		VM_WARN_ON(!put_page_testzero(pfn_to_page(pfn)));
+	for (i = 0; i < count; i++, pfn++) {
+		int __maybe_unused ret = put_page_testzero(pfn_to_page(pfn));
+
+		VM_WARN_ON(!ret);
+	}

 	__cma_release_frozen(cma, cmr, pages, count);



Best Regards,
Yan, Zi

Yes, that patch fixes the issue.
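
For reference, the pattern used by the patch can be sketched in userspace
roughly as follows. This is an illustration under assumptions, not the
kernel code: the FAKE_* macros, struct fake_page and the helper functions
are hypothetical, and the unused attribute assumes GCC or Clang.

```c
#include <stddef.h>

/* Same no-op stand-in as VM_WARN_ON() with CONFIG_DEBUG_VM disabled. */
#define FAKE_VM_WARN_ON(cond) ((void)sizeof((long)(cond)))

/* Assumption: GCC/Clang attribute, mirroring the kernel's __maybe_unused. */
#define FAKE_MAYBE_UNUSED __attribute__((unused))

/* Hypothetical page with nothing but a reference count. */
struct fake_page {
	int refcount;
};

/* Drops one reference; returns nonzero when the count reaches zero. */
static int fake_put_testzero(struct fake_page *p)
{
	return --p->refcount == 0;
}

/* Fixed pattern: always drop the reference, only the check is optional. */
static void fake_release_range(struct fake_page *pages, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		int FAKE_MAYBE_UNUSED ret = fake_put_testzero(&pages[i]);

		FAKE_VM_WARN_ON(!ret);
	}
}

/* Releases three singly-referenced pages and sums what is left over. */
static int leftover_refs_after_fixed_release(void)
{
	struct fake_page pages[3] = { { 1 }, { 1 }, { 1 } };
	int leftover = 0;

	fake_release_range(pages, 3);
	for (size_t i = 0; i < 3; i++)
		leftover += pages[i].refcount;
	return leftover;
}
```

Because the call now happens outside the macro argument, every count drops
to zero whether or not the warning macro generates code.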