Re: [patch v3] mm, page_alloc: reintroduce page allocation stall warning

From: Shakeel Butt

Date: Mon Mar 30 2026 - 23:02:50 EST


On Mon, Mar 30, 2026 at 06:20:57PM -0700, David Rientjes wrote:
> Previously, we had warnings when a single page allocation took longer
> than reasonably expected. This was introduced in commit 63f53dea0c98
> ("mm: warn about allocations which stall for too long").
>
> The warning was subsequently reverted in commit 400e22499dd9 ("mm: don't
> warn about allocations which stall for too long") because it was possible
> to generate memory pressure that would effectively stall further progress
> through printk execution.
>
> Page allocation stalls in excess of 10 seconds are always useful to debug
> because they can result in severe userspace unresponsiveness. Logging
> them provides an artifact that can be correlated with userspace going out
> to lunch and used to understand the state of memory at the time.
>
> There should be a reasonable expectation that this warning will never
> trigger given it is very passive: it will only be emitted when a page
> allocation takes longer than 10 seconds. If it does trigger, this
> reveals an issue that should be fixed: a single page allocation should
> never loop for more than 10 seconds without oom killing to make memory
> available.
>
> Unlike the original implementation, this implementation only reports
> stalls once for the system every 10 seconds. Otherwise, many concurrent
> reclaimers could spam the kernel log unnecessarily. Stalls are only
> reported when calling into direct reclaim.
>
> Acked-by: Vlastimil Babka (SUSE) <vbabka@xxxxxxxxxx>
> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>

Reviewed-by: Shakeel Butt <shakeel.butt@xxxxxxxxx>

I am hoping you are reintroducing these warnings because you are
already seeing such stalls in your production environment. Do you have
anything interesting to share?
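
For anyone following the thread, the "once for the system every 10
seconds" behavior described in the changelog can be sketched in
userspace C roughly as below. This is only an illustration of the
idea, not the code from the patch: should_report_stall(),
next_report_ms and both constants are hypothetical names, and
timestamps are plain millisecond counters rather than jiffies.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values taken from the changelog text, not from the patch. */
#define STALL_THRESHOLD_MS 10000 /* a single allocation stalled this long */
#define REPORT_INTERVAL_MS 10000 /* minimum gap between system-wide reports */

/* Earliest time (ms) at which any caller may emit the next report. */
static _Atomic uint64_t next_report_ms;

/*
 * Hypothetical helper: returns true if the caller should emit a stall
 * warning. alloc_start_ms is when the allocation began, now_ms is the
 * current time, both on the same monotonic millisecond clock.
 */
static bool should_report_stall(uint64_t alloc_start_ms, uint64_t now_ms)
{
	uint64_t next;

	/* Not stalled for longer than the threshold: nothing to report. */
	if (now_ms - alloc_start_ms <= STALL_THRESHOLD_MS)
		return false;

	/* Another caller already reported within the current interval. */
	next = atomic_load(&next_report_ms);
	if (now_ms < next)
		return false;

	/*
	 * Of many concurrent reclaimers, only the one that wins this
	 * exchange reports; the rest see the updated deadline and skip.
	 */
	return atomic_compare_exchange_strong(&next_report_ms, &next,
					      now_ms + REPORT_INTERVAL_MS);
}
```

In this sketch the caller would invoke the helper from the direct
reclaim slow path only, which matches the changelog's statement that
stalls are reported only when calling into direct reclaim.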