> On Tue 01-02-22 12:04:37, Waiman Long wrote:
>> On 2/1/22 05:54, Michal Hocko wrote:
>>> On Mon 31-01-22 14:23:07, Waiman Long wrote:
>>>> It was found that a number of offlined memcgs were not freed because
>>>> they were pinned by some charged pages that were present. Even "echo
>>>> 1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
>>>> offlined but not freed memcgs tend to increase in number over time with
>>>> the side effect that percpu memory consumption as shown in /proc/meminfo
>>>> also increases over time.
>>>>
>>>> In order to find out more information about those pages that pin
>>>> offlined memcgs, the page_owner feature is extended to print memory
>>>> cgroup information especially whether the cgroup is offlined or not.
>>>>
>>>> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
>>>> Acked-by: David Rientjes <rientjes@xxxxxxxxxx>
>>>> ---
>>>>  mm/page_owner.c | 39 +++++++++++++++++++++++++++++++++++++++
>>>>  1 file changed, 39 insertions(+)
>>>>
>>>> diff --git a/mm/page_owner.c b/mm/page_owner.c
>>>> index 28dac73e0542..a471c74c7fe0 100644
>>>> --- a/mm/page_owner.c
>>>> +++ b/mm/page_owner.c
>>>> @@ -10,6 +10,7 @@
>>>>  #include <linux/migrate.h>
>>>>  #include <linux/stackdepot.h>
>>>>  #include <linux/seq_file.h>
>>>> +#include <linux/memcontrol.h>
>>>>  #include <linux/sched/clock.h>
>>>>
>>>>  #include "internal.h"
>>>> @@ -325,6 +326,42 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
>>>>  	seq_putc(m, '\n');
>>>>  }
>>>>
>>>> +#ifdef CONFIG_MEMCG
>>>> +/*
>>>> + * Looking for memcg information and print it out
>>>> + */
>>>> +static inline void print_page_owner_memcg(char *kbuf, size_t count, int *pret,
>>>> +					  struct page *page)
>>>> +{
>>>> +	unsigned long memcg_data = READ_ONCE(page->memcg_data);
>>>> +	struct mem_cgroup *memcg;
>>>> +	bool onlined;
>>>> +	char name[80];
>>>> +
>>>> +	if (!memcg_data)
>>>> +		return;
>>>> +
>>>> +	if (memcg_data & MEMCG_DATA_OBJCGS)
>>>> +		*pret += scnprintf(kbuf + *pret, count - *pret,
>>>> +				"Slab cache page\n");
>>>> +
>>>> +	memcg = page_memcg_check(page);
>>>> +	if (!memcg)
>>>> +		return;
>>>> +
>>>> +	onlined = (memcg->css.flags & CSS_ONLINE);
>>>> +	cgroup_name(memcg->css.cgroup, name, sizeof(name));
>>>> +	*pret += scnprintf(kbuf + *pret, count - *pret,
>>>> +			"Charged %sto %smemcg %s\n",
>>>> +			PageMemcgKmem(page) ? "(via objcg) " : "",
>>>> +			onlined ? "" : "offlined ",
>>>> +			name);
>>>
>>> I have asked in the previous version already but what makes the memcg
>>> stable (why it cannot go away and be reallocated for something else)
>>> while you are trying to get its name?
>>
>> The memcg is not going away as long as the page isn't freed, unless it is
>> indirectly connected via objcg. Of course, there can be a race between the
>> page being freed and the page_owner information being displayed.
>
> Right. And that means that cgroup_name can go off the rails and wander
> through memory, correct?
>
>> One solution is to add a simple bit lock to each of the page_owner
>> structures and acquire the lock when it is being written to or read from.
>
> I do not really see how a bit lock could prevent memcg from going away.
> On the other hand I think RCU read lock should be sufficient to keep the
> memcg from going away completely.
>
>> Anyway a lot of these debugging aids or tools don't eliminate all
>> the race conditions that affect the accuracy of the displayed information. I
>> can add a patch to eliminate this direct memcg race if you think this is
>> necessary.
>
> I do not mind inaccurate information. That is natural but reading
> through freed memory can be really harmful. So this really needs to be
> sorted out.

Using rcu_read_lock() is also what I have been thinking of doing. So I will
update the patch to add that for safety.
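
For illustration only, here is a minimal userspace sketch of the shape the rcu_read_lock() approach discussed above could take. It is not the updated patch. rcu_read_lock(), rcu_read_unlock(), page_memcg_check() and scnprintf() are mocked so the control flow can be compiled outside the kernel, and struct page and struct mem_cgroup are hypothetical cut-down stand-ins. The mock lock only counts nesting and gives no real lifetime protection. The point it shows is that the memcg lookup and every use of the memcg pointer sit inside one read-side section, with a single unlock on every exit path.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* MOCK: stands in for the kernel RCU read-side section; only counts nesting. */
static int rcu_depth;
static void rcu_read_lock(void) { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

/* HYPOTHETICAL cut-down stand-ins for the kernel structures. */
struct mem_cgroup { bool online; const char *name; };
struct page { struct mem_cgroup *memcg; bool kmem; };

/* MOCK: like the kernel's scnprintf(), returns the bytes actually written. */
static int scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	if (size == 0)
		return 0;
	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);
	if (n < 0)
		return 0;
	return (size_t)n >= size ? (int)(size - 1) : n;
}

/* MOCK lookup: asserts that the caller is inside the read-side section. */
static struct mem_cgroup *page_memcg_check(struct page *page)
{
	assert(rcu_depth > 0);
	return page->memcg;
}

static void print_page_owner_memcg(char *kbuf, size_t count, int *pret,
				   struct page *page)
{
	struct mem_cgroup *memcg;

	/* Look up and dereference the memcg only while the lock is held. */
	rcu_read_lock();
	memcg = page_memcg_check(page);
	if (!memcg)
		goto out_unlock;

	*pret += scnprintf(kbuf + *pret, count - *pret,
			   "Charged %sto %smemcg %s\n",
			   page->kmem ? "(via objcg) " : "",
			   memcg->online ? "" : "offlined ",
			   memcg->name);
out_unlock:
	rcu_read_unlock();
}
```

The single out_unlock label keeps the lock and unlock balanced on the early-return path, which is the main structural change relative to the quoted hunk.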