[rfc patch] sched/topology: fix domain reconstruction memory leakage
From: Mike Galbraith
Date: Sat Aug 19 2017 - 02:11:58 EST
Greetings,
While beating on cpu hotplug with the shiny new topology fixes
backported, my memory-poor 8-socket box fairly quickly leaked itself to
death, 0c0e776a9b0f being the culprit. With the below applied, box
took a severe beating overnight without a whimper.
I'm wondering (ergo rfc) if free_sched_groups() shouldn't be renamed to
put_sched_groups() instead, with overlapping domains taking a group
reference as well, so they can put both sg and sgc rather than putting
one and freeing the other. Those places that want an explicit free
could pass a flag to free only sg (or use two functions). The
minimalist approach below works (minus signs, yay), but could perhaps
use some "pretty".
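Something like the below is roughly what I mean: a sketch only, the
names and the split are hypothetical, and it is not the patch that
follows.

	/* Sketch: drop the references taken during domain generation. */
	static void put_sched_groups(struct sched_group *sg)
	{
		struct sched_group *tmp, *first = sg;

		if (!sg)
			return;

		do {
			tmp = sg->next;
			if (atomic_dec_and_test(&sg->sgc->ref))
				kfree(sg->sgc);
			if (atomic_dec_and_test(&sg->ref))
				kfree(sg);
			sg = tmp;
		} while (sg != first);
	}

	/* Sketch: unconditional free for error paths tearing down a partial list. */
	static void free_sched_groups(struct sched_group *sg)
	{
		struct sched_group *tmp, *first = sg;

		if (!sg)
			return;

		do {
			tmp = sg->next;
			kfree(sg);
			sg = tmp;
		} while (sg != first);
	}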
sched/topology: fix domain reconstruction memory leakage
Since 0c0e776a9b0f, build_sched_groups() takes a reference on each
sg and sgc during domain generation, where previously it only took
a reference on the first group. Iterate groups and drop all added
references during domain destruction, otherwise CPU hotplug leaks.
Signed-off-by: Mike Galbraith <mgalbraith@xxxxxxx>
Fixes: 0c0e776a9b0f ("sched/topology: Rewrite get_group()")
---
kernel/sched/topology.c | 19 ++++++++-----------
1 file changed, 8 insertions(+), 11 deletions(-)
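(Aside, my paraphrase of the imbalance, not verbatim mainline code:
since the rewrite, get_group() does roughly

	sg = *per_cpu_ptr(sdd->sg, cpu);
	sg->sgc = *per_cpu_ptr(sdd->sgc, cpu);
	atomic_inc(&sg->ref);		/* taken for every CPU in the group */
	atomic_inc(&sg->sgc->ref);

for every CPU, while the old !SD_OVERLAP destroy path only ever dropped
one reference on sd->groups and its sgc.)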
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -323,7 +323,7 @@ static struct root_domain *alloc_rootdom
 	return rd;
 }
 
-static void free_sched_groups(struct sched_group *sg, int free_sgc)
+static void free_sched_groups(struct sched_group *sg, int put)
 {
 	struct sched_group *tmp, *first;
 
@@ -334,10 +334,11 @@ static void free_sched_groups(struct sch
 	do {
 		tmp = sg->next;
 
-		if (free_sgc && atomic_dec_and_test(&sg->sgc->ref))
+		if (put && atomic_dec_and_test(&sg->sgc->ref))
 			kfree(sg->sgc);
 
-		kfree(sg);
+		if (put < 2 || atomic_dec_and_test(&sg->ref))
+			kfree(sg);
 		sg = tmp;
 	} while (sg != first);
 }
@@ -345,15 +346,11 @@ static void free_sched_groups(struct sch
 static void destroy_sched_domain(struct sched_domain *sd)
 {
 	/*
-	 * If its an overlapping domain it has private groups, iterate and
-	 * nuke them all.
+	 * If it's an overlapping domain it has private groups, iterate,
+	 * freeing groups, otherwise dropping group references. In both
+	 * cases, we must drop group capacity references.
 	 */
-	if (sd->flags & SD_OVERLAP) {
-		free_sched_groups(sd->groups, 1);
-	} else if (atomic_dec_and_test(&sd->groups->ref)) {
-		kfree(sd->groups->sgc);
-		kfree(sd->groups);
-	}
+	free_sched_groups(sd->groups, !(sd->flags & SD_OVERLAP)+1);
 	if (sd->shared && atomic_dec_and_test(&sd->shared->ref))
 		kfree(sd->shared);
 	kfree(sd);
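For review convenience, my summary of what the second argument now
encodes (this note is not part of the diff, and the put=0 case is my
reading of the existing error-path callers):

	free_sched_groups(sg, 0);	/* error paths: free sg only, don't touch sgc */
	free_sched_groups(sg, 1);	/* SD_OVERLAP: put sgc, free sg unconditionally */
	free_sched_groups(sg, 2);	/* !SD_OVERLAP: put sgc, put sg */

so destroy_sched_domain() passes !(sd->flags & SD_OVERLAP)+1, i.e. 1 for
overlapping domains and 2 otherwise.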