v5.13 regression - suspend went belly up

From: Mike Galbraith
Date: Mon Jun 28 2021 - 09:33:47 EST


Greetings,

A regression popping up post rc7 is a bit unusual, but hohum, I suppose
they can bite whenever they damn well feel like it. This one was
bisected rc7..release, the (surprising to me) result then confirmed in
four disgruntled local trees.

b3b64ebd38225d8032b5db42938d969b602040c2 is the first bad commit
commit b3b64ebd38225d8032b5db42938d969b602040c2
Author: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Date: Thu Jun 24 18:40:07 2021 -0700

mm/page_alloc: do bulk array bounds check after checking populated elements

Dan Carpenter reported the following

The patch 0f87d9d30f21: "mm/page_alloc: add an array-based interface
to the bulk page allocator" from Apr 29, 2021, leads to the following
static checker warning:

mm/page_alloc.c:5338 __alloc_pages_bulk()
warn: potentially one past the end of array 'page_array[nr_populated]'

The problem can occur if an array is passed in that is fully populated.
That potentially ends up allocating a single page and storing it past
the end of the array. This patch returns 0 if the array is fully
populated.

Link: https://lkml.kernel.org/r/20210618125102.GU30378@xxxxxxxxxxxxxxxxxxx
Fixes: 0f87d9d30f21 ("mm/page_alloc: add an array-based interface to the bulk page allocator")
Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxxx>
Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
Cc: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>

mm/page_alloc.c | 4 ++++
1 file changed, 4 insertions(+)

[ 45.503409] PM: suspend entry (deep)
[ 45.529287] Filesystems sync: 0.026 seconds
[ 45.532934] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 45.534340] OOM killer disabled.
[ 45.534341] Freezing remaining freezable tasks ...
[ 65.505035] Freezing of tasks failed after 20.008 seconds (5 tasks refusing to freeze, wq_busy=0):
[ 65.505079] task:lockd state:S stack: 0 pid: 1809 ppid: 2 flags:0x00004000
[ 65.505093] Call Trace:
[ 65.505102] __schedule+0x28b/0x870
[ 65.505121] schedule+0x3c/0xa0
[ 65.505130] schedule_timeout+0x1d2/0x260
[ 65.505145] ? del_timer_sync+0x40/0x40
[ 65.505160] svc_recv+0xc3/0x8a0 [sunrpc]
[ 65.505310] ? grace_ender+0x10/0x10 [lockd]
[ 65.505331] lockd+0x8b/0x190 [lockd]
[ 65.505348] ? __kthread_parkme+0x4c/0x70
[ 65.505361] kthread+0x115/0x130
[ 65.505371] ? kthread_park+0x90/0x90
[ 65.505381] ret_from_fork+0x1f/0x30
[ 65.505401] task:nfsd state:S stack: 0 pid: 2044 ppid: 2 flags:0x00004000
[ 65.505413] Call Trace:
[ 65.505417] __schedule+0x28b/0x870
[ 65.505428] schedule+0x3c/0xa0
[ 65.505437] schedule_timeout+0x1d2/0x260
[ 65.505449] ? del_timer_sync+0x40/0x40
[ 65.505459] svc_recv+0xc3/0x8a0 [sunrpc]
[ 65.505572] ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 65.505636] nfsd+0xd6/0x150 [nfsd]
[ 65.505691] kthread+0x115/0x130
[ 65.505701] ? kthread_park+0x90/0x90
[ 65.505710] ret_from_fork+0x1f/0x30
[ 65.505722] task:nfsd state:S stack: 0 pid: 2045 ppid: 2 flags:0x00004000
[ 65.505732] Call Trace:
[ 65.505737] __schedule+0x28b/0x870
[ 65.505748] schedule+0x3c/0xa0
[ 65.505757] schedule_timeout+0x1d2/0x260
[ 65.505769] ? del_timer_sync+0x40/0x40
[ 65.505779] svc_recv+0xc3/0x8a0 [sunrpc]
[ 65.505889] ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 65.505945] nfsd+0xd6/0x150 [nfsd]
[ 65.505998] kthread+0x115/0x130
[ 65.506008] ? kthread_park+0x90/0x90
[ 65.506017] ret_from_fork+0x1f/0x30
[ 65.506029] task:nfsd state:S stack: 0 pid: 2046 ppid: 2 flags:0x00004000
[ 65.506039] Call Trace:
[ 65.506043] __schedule+0x28b/0x870
[ 65.506054] schedule+0x3c/0xa0
[ 65.506063] schedule_timeout+0x1d2/0x260
[ 65.506081] ? del_timer_sync+0x40/0x40
[ 65.506094] svc_recv+0xc3/0x8a0 [sunrpc]
[ 65.506202] ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 65.506260] nfsd+0xd6/0x150 [nfsd]
[ 65.506314] kthread+0x115/0x130
[ 65.506327] ? kthread_park+0x90/0x90
[ 65.506339] ret_from_fork+0x1f/0x30
[ 65.506355] task:nfsd state:S stack: 0 pid: 2051 ppid: 2 flags:0x00004000
[ 65.506367] Call Trace:
[ 65.506374] __schedule+0x28b/0x870
[ 65.506387] schedule+0x3c/0xa0
[ 65.506399] schedule_timeout+0x1d2/0x260
[ 65.506414] ? del_timer_sync+0x40/0x40
[ 65.506426] svc_recv+0xc3/0x8a0 [sunrpc]
[ 65.506530] ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 65.506608] nfsd+0xd6/0x150 [nfsd]
[ 65.506662] kthread+0x115/0x130
[ 65.506675] ? kthread_park+0x90/0x90
[ 65.506687] ret_from_fork+0x1f/0x30

[ 65.506780] Restarting kernel threads ... done.
[ 65.506968] OOM killer enabled.
[ 65.506973] Restarting tasks ... done.
[ 65.508047] PM: suspend exit
[ 65.508081] PM: suspend entry (s2idle)
[ 65.516062] Filesystems sync: 0.007 seconds
[ 65.516186] Freezing user space processes ... (elapsed 0.007 seconds) done.
[ 65.523663] OOM killer disabled.
[ 65.523664] Freezing remaining freezable tasks ...
[ 85.525885] Freezing of tasks failed after 20.004 seconds (9 tasks refusing to freeze, wq_busy=0):
[ 85.525937] task:lockd state:S stack: 0 pid: 1809 ppid: 2 flags:0x00004000
[ 85.525955] Call Trace:
[ 85.525966] __schedule+0x28b/0x870
[ 85.525989] schedule+0x3c/0xa0
[ 85.526001] schedule_timeout+0x1d2/0x260
[ 85.526017] ? del_timer_sync+0x40/0x40
[ 85.526033] svc_recv+0xc3/0x8a0 [sunrpc]
[ 85.526167] ? grace_ender+0x10/0x10 [lockd]
[ 85.526190] lockd+0x8b/0x190 [lockd]
[ 85.526210] ? __kthread_parkme+0x4c/0x70
[ 85.526225] kthread+0x115/0x130
[ 85.526237] ? kthread_park+0x90/0x90
[ 85.526250] ret_from_fork+0x1f/0x30
[ 85.526271] task:nfsd state:S stack: 0 pid: 2044 ppid: 2 flags:0x00004000
[ 85.526284] Call Trace:
[ 85.526291] __schedule+0x28b/0x870
[ 85.526305] schedule+0x3c/0xa0
[ 85.526317] schedule_timeout+0x1d2/0x260
[ 85.526332] ? del_timer_sync+0x40/0x40
[ 85.526344] svc_recv+0xc3/0x8a0 [sunrpc]
[ 85.526451] ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 85.526511] nfsd+0xd6/0x150 [nfsd]
[ 85.526563] kthread+0x115/0x130
[ 85.526575] ? kthread_park+0x90/0x90
[ 85.526588] ret_from_fork+0x1f/0x30
[ 85.526602] task:nfsd state:S stack: 0 pid: 2045 ppid: 2 flags:0x00004000
[ 85.526615] Call Trace:
[ 85.526621] __schedule+0x28b/0x870
[ 85.526634] schedule+0x3c/0xa0
[ 85.526646] schedule_timeout+0x1d2/0x260
[ 85.526661] ? del_timer_sync+0x40/0x40
[ 85.526673] svc_recv+0xc3/0x8a0 [sunrpc]
[ 85.526776] ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 85.526830] nfsd+0xd6/0x150 [nfsd]
[ 85.526879] kthread+0x115/0x130
[ 85.526892] ? kthread_park+0x90/0x90
[ 85.526904] ret_from_fork+0x1f/0x30
[ 85.526918] task:nfsd state:S stack: 0 pid: 2046 ppid: 2 flags:0x00004000
[ 85.526931] Call Trace:
[ 85.526937] __schedule+0x28b/0x870
[ 85.526951] schedule+0x3c/0xa0
[ 85.526962] schedule_timeout+0x1d2/0x260
[ 85.526977] ? del_timer_sync+0x40/0x40
[ 85.526989] svc_recv+0xc3/0x8a0 [sunrpc]
[ 85.527089] ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 85.527142] nfsd+0xd6/0x150 [nfsd]
[ 85.527192] kthread+0x115/0x130
[ 85.527204] ? kthread_park+0x90/0x90
[ 85.527217] ret_from_fork+0x1f/0x30
[ 85.527231] task:nfsd state:S stack: 0 pid: 2047 ppid: 2 flags:0x00004000
[ 85.527242] Call Trace:
[ 85.527249] __schedule+0x28b/0x870
[ 85.527262] schedule+0x3c/0xa0
[ 85.527273] schedule_timeout+0x1d2/0x260
[ 85.527288] ? del_timer_sync+0x40/0x40
[ 85.527300] svc_recv+0xc3/0x8a0 [sunrpc]
[ 85.527397] ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 85.527449] nfsd+0xd6/0x150 [nfsd]
[ 85.527497] kthread+0x115/0x130
[ 85.527510] ? kthread_park+0x90/0x90
[ 85.527522] ret_from_fork+0x1f/0x30
[ 85.527536] task:nfsd state:S stack: 0 pid: 2048 ppid: 2 flags:0x00004000
[ 85.527548] Call Trace:
[ 85.527554] __schedule+0x28b/0x870
[ 85.527568] schedule+0x3c/0xa0
[ 85.527579] schedule_timeout+0x1d2/0x260
[ 85.527594] ? del_timer_sync+0x40/0x40
[ 85.527606] svc_recv+0xc3/0x8a0 [sunrpc]
[ 85.527698] ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 85.527749] nfsd+0xd6/0x150 [nfsd]
[ 85.527798] kthread+0x115/0x130
[ 85.527810] ? kthread_park+0x90/0x90
[ 85.527822] ret_from_fork+0x1f/0x30
[ 85.527837] task:nfsd state:S stack: 0 pid: 2049 ppid: 2 flags:0x00004000
[ 85.527848] Call Trace:
[ 85.527854] __schedule+0x28b/0x870
[ 85.527868] schedule+0x3c/0xa0
[ 85.527879] schedule_timeout+0x1d2/0x260
[ 85.527894] ? del_timer_sync+0x40/0x40
[ 85.527906] svc_recv+0xc3/0x8a0 [sunrpc]
[ 85.527999] ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 85.528050] nfsd+0xd6/0x150 [nfsd]
[ 85.528098] kthread+0x115/0x130
[ 85.528110] ? kthread_park+0x90/0x90
[ 85.528122] ret_from_fork+0x1f/0x30
[ 85.528137] task:nfsd state:S stack: 0 pid: 2050 ppid: 2 flags:0x00004000
[ 85.528148] Call Trace:
[ 85.528154] __schedule+0x28b/0x870
[ 85.528167] schedule+0x3c/0xa0
[ 85.528178] schedule_timeout+0x1d2/0x260
[ 85.528193] ? del_timer_sync+0x40/0x40
[ 85.528205] svc_recv+0xc3/0x8a0 [sunrpc]
[ 85.528298] ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 85.528363] nfsd+0xd6/0x150 [nfsd]
[ 85.528412] kthread+0x115/0x130
[ 85.528424] ? kthread_park+0x90/0x90
[ 85.528436] ret_from_fork+0x1f/0x30
[ 85.528450] task:nfsd state:S stack: 0 pid: 2051 ppid: 2 flags:0x00004000
[ 85.528462] Call Trace:
[ 85.528469] __schedule+0x28b/0x870
[ 85.528482] schedule+0x3c/0xa0
[ 85.528494] schedule_timeout+0x1d2/0x260
[ 85.528508] ? del_timer_sync+0x40/0x40
[ 85.528521] svc_recv+0xc3/0x8a0 [sunrpc]
[ 85.528616] ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[ 85.528668] nfsd+0xd6/0x150 [nfsd]
[ 85.528715] kthread+0x115/0x130
[ 85.528728] ? kthread_park+0x90/0x90
[ 85.528740] ret_from_fork+0x1f/0x30

[ 85.528831] Restarting kernel threads ... done.
[ 85.529020] OOM killer enabled.
[ 85.529024] Restarting tasks ... done.
[ 85.529753] PM: suspend exit