Re: [RFC] mm, page_alloc: reintroduce page allocation stall warning

From: Andrew Morton

Date: Mon Mar 23 2026 - 15:09:22 EST


On Sat, 21 Mar 2026 20:03:16 -0700 (PDT) David Rientjes <rientjes@xxxxxxxxxx> wrote:

> Previously, we had warnings when a single page allocation took longer
> than reasonably expected. This was introduced in commit 63f53dea0c98
> ("mm: warn about allocations which stall for too long").
>
> The warning was subsequently reverted in commit 400e22499dd9 ("mm: don't
> warn about allocations which stall for too long") but for reasons
> unrelated to the warning itself.
>
> Page allocation stalls in excess of 10 seconds are always useful to debug
> because they can result in severe userspace unresponsiveness. This
> warning provides an artifact that can be correlated with userspace going
> out to lunch and used to understand the state of memory at the time.
>
> There should be a reasonable expectation that this warning will never
> trigger, given that it is very passive: it starts with a 10 second
> floor. If it does trigger, this reveals an issue that should be fixed:
> a single page allocation should never loop for more than 10 seconds
> without oom killing to make memory available.
>
> Unlike the original implementation, this implementation only reports
> stalls that are at least a second longer than the longest stall reported
> thus far.

AI review: https://sashiko.dev/#/patchset/30945cc3-9c4d-94bb-e7e7-dde71483800c@xxxxxxxxxx

The warn_alloc_show_mem() inside spin_lock_irqsave() does sound
problematic.
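
For illustration only, here is a rough userspace sketch of how the
"at least a second longer than the longest stall reported thus far" check
could be done without holding a lock around the report. It assumes the
lock is only there to protect the longest-stall bookkeeping, which may not
match the patch. The names (stall_should_report, longest_reported_ms and
the two constants) are made up for this sketch and are not taken from the
patch, and C11 atomics stand in for the kernel's own primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Both values follow the patch description; the names are hypothetical. */
#define STALL_FLOOR_MS 10000u	/* nothing below 10 seconds is reported */
#define STALL_STEP_MS   1000u	/* must beat the last report by 1 second */

/* Longest stall reported so far, in milliseconds; 0 means none yet. */
static _Atomic uint64_t longest_reported_ms;

/*
 * Hypothetical helper: claim the right to report a stall of stall_ms.
 *
 * Returns true for exactly one caller per new high-water mark. The
 * caller would then do the expensive printing (the show_mem-style
 * dump) with no lock held and interrupts enabled.
 */
static bool stall_should_report(uint64_t stall_ms)
{
	uint64_t prev = atomic_load(&longest_reported_ms);

	do {
		if (stall_ms < STALL_FLOOR_MS)
			return false;
		/* prev is refreshed by a failed compare-exchange below. */
		if (prev && stall_ms < prev + STALL_STEP_MS)
			return false;
	} while (!atomic_compare_exchange_weak(&longest_reported_ms,
					       &prev, stall_ms));

	return true;
}
```

With something along these lines, the decision to report is a single
compare-and-exchange and the memory dump happens afterwards, outside any
irq-disabled section. Whether that fits the actual patch depends on what
else the lock is serializing.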