[PATCH 5.12 268/384] sched,psi: Handle potential task count underflow bugs more gracefully

From: Greg Kroah-Hartman
Date: Mon May 10 2021 - 08:37:28 EST


From: Charan Teja Reddy <charante@xxxxxxxxxxxxxx>

[ Upstream commit 9d10a13d1e4c349b76f1c675a874a7f981d6d3b4 ]

psi_group_cpu->tasks, represented by the unsigned int, stores the
number of tasks that could be stalled on a psi resource(io/mem/cpu).
Decrementing these counters at zero leads to wrapping which further
leads to the psi_group_cpu->state_mask is being set with the
respective pressure state. This could result into the unnecessary time
sampling for the pressure state thus cause the spurious psi events.
This can further lead to wrong actions being taken at the user land
based on these psi events.

Though psi_bug is set under these conditions but that just for debug
purpose. Fix it by decrementing the ->tasks count only when it is
non-zero.

Signed-off-by: Charan Teja Reddy <charante@xxxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Link: https://lkml.kernel.org/r/1618585336-37219-1-git-send-email-charante@xxxxxxxxxxxxxx
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
kernel/sched/psi.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 967732c0766c..651218ded981 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -711,14 +711,15 @@ static void psi_group_change(struct psi_group *group, int cpu,
for (t = 0, m = clear; m; m &= ~(1 << t), t++) {
if (!(m & (1 << t)))
continue;
- if (groupc->tasks[t] == 0 && !psi_bug) {
+ if (groupc->tasks[t]) {
+ groupc->tasks[t]--;
+ } else if (!psi_bug) {
printk_deferred(KERN_ERR "psi: task underflow! cpu=%d t=%d tasks=[%u %u %u %u] clear=%x set=%x\n",
cpu, t, groupc->tasks[0],
groupc->tasks[1], groupc->tasks[2],
groupc->tasks[3], clear, set);
psi_bug = 1;
}
- groupc->tasks[t]--;
}

for (t = 0; set; set &= ~(1 << t), t++)
--
2.30.2