[PATCH 3.16.y-ckt 098/129] mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress
From: Luis Henriques
Date: Fri Feb 26 2016 - 05:40:00 EST
3.16.7-ckt25 -stable review patch. If anyone has any objections, please let me know.
---8<------------------------------------------------------------
From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
commit 564e81a57f9788b1475127012e0fd44e9049e342 upstream.
Jan Stancek has reported that system occasionally hanging after "oom01"
testcase from LTP triggers OOM. Guessing from a result that there is a
kworker thread doing memory allocation and the values between "Node 0
Normal free:" and "Node 0 Normal:" differs when hanging, vmstat is not
up-to-date for some reason.
According to commit 373ccbe59270 ("mm, vmstat: allow WQ concurrency to
discover memory reclaim doesn't make any progress"), it meant to force
the kworker thread to take a short sleep, but it by error used
schedule_timeout(1). We missed that schedule_timeout() in state
TASK_RUNNING doesn't do anything.
Fix it by using schedule_timeout_uninterruptible(1) which forces the
kworker thread to take a short sleep in order to make sure that vmstat
is up-to-date.
Fixes: 373ccbe59270 ("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress")
Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Reported-by: Jan Stancek <jstancek@xxxxxxxxxx>
Acked-by: Michal Hocko <mhocko@xxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Cristopher Lameter <clameter@xxxxxxx>
Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
Cc: Arkadiusz Miskiewicz <arekm@xxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Luis Henriques <luis.henriques@xxxxxxxxxxxxx>
---
mm/backing-dev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 035be81fb150..afc8593327d6 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -643,7 +643,7 @@ long wait_iff_congested(struct zone *zone, int sync, long timeout)
* here rather than calling cond_resched().
*/
if (current->flags & PF_WQ_WORKER)
- schedule_timeout(1);
+ schedule_timeout_uninterruptible(1);
else
cond_resched();