On Sat, 20 Jun 2020 14:47:19 -0400 Waiman Long <longman@xxxxxxxxxx> wrote:
It was found that running the LTP test on a PowerPC system could produce
erroneous values in /proc/meminfo, like:
MemTotal: 531915072 kB
MemFree: 507962176 kB
MemAvailable: 1100020596352 kB
Using bisection, the problem is tracked down to commit 9c315e4d7d8c
("mm: memcg/slab: cache page number in memcg_(un)charge_slab()").
In memcg_uncharge_slab() with a "int order" argument:
unsigned int nr_pages = 1 << order;
:
mod_lruvec_state(lruvec, cache_vmstat_idx(s), -nr_pages);
The mod_lruvec_state() function will eventually call the
__mod_zone_page_state() which accepts a long argument. Depending on
the compiler and how inlining is done, "-nr_pages" may be treated as
a negative number or a very large positive number. Apparently, it was
treated as a large positive number in that PowerPC system leading to
incorrect stat counts. This problem hasn't been seen in x86-64 yet,
perhaps the gcc compiler there has some slight difference in behavior.
It is fixed by making nr_pages a signed value. For consistency, a
similar change is applied to memcg_charge_slab() as well.
This is somewhat disturbing.
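To make the conversion issue concrete, here is a minimal userspace sketch in plain C. It is not kernel code: record_delta() is a hypothetical stand-in for a helper that accepts a long (as __mod_zone_page_state() is described as doing), and the other function names are invented for illustration. It only shows what the C conversion rules give when an unsigned int is negated and the result is passed straight to a long parameter; it does not reproduce the kernel's real call chain or the inlining behaviour mentioned above.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a stat helper that takes a long delta. */
static long record_delta(long delta)
{
	return delta;
}

/* Pre-patch declaration: the negation is done in unsigned int, so the
 * result wraps to UINT_MAX + 1 - nr_pages before it is widened to long. */
static long delta_with_unsigned(int order)
{
	unsigned int nr_pages = 1U << order;

	return record_delta(-nr_pages);
}

/* Patched declaration: the negation is done in int, so the value is
 * negative before it is widened to long. */
static long delta_with_signed(int order)
{
	int nr_pages = 1 << order;

	return record_delta(-nr_pages);
}

/* Sanity checks. The unsigned check only applies where long is wider
 * than unsigned int (for example 64-bit Linux). */
static void check_deltas(void)
{
	assert(delta_with_signed(3) == -8);
	if (sizeof(long) > sizeof(unsigned int))
		assert(delta_with_unsigned(3) == (long)UINT_MAX - 7);
}
```

Under those assumptions, an order-3 uncharge passed this way would add nearly 2^32 pages to the counter instead of subtracting 8.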
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -348,7 +348,7 @@ static __always_inline int memcg_charge_slab(struct page *page,
gfp_t gfp, int order,
struct kmem_cache *s)
{
- unsigned int nr_pages = 1 << order;
+ int nr_pages = 1 << order;
struct mem_cgroup *memcg;
struct lruvec *lruvec;
int ret;
@@ -388,7 +388,7 @@ static __always_inline int memcg_charge_slab(struct page *page,
static __always_inline void memcg_uncharge_slab(struct page *page, int order,
struct kmem_cache *s)
{
- unsigned int nr_pages = 1 << order;
+ int nr_pages = 1 << order;
struct mem_cgroup *memcg;
struct lruvec *lruvec;
I grabbed the patch, but Roman's "mm: memcg/slab: charge individual
slab objects instead of pages"
(http://lkml.kernel.org/r/20200608230654.828134-10-guro@xxxxxx) deletes
both these functions.
It replaces the offending code with, afaict,
static inline void memcg_slab_free_hook(struct kmem_cache *s, struct page *page,
void *p)
{
struct obj_cgroup *objcg;
unsigned int off;
if (!memcg_kmem_enabled() || is_root_cache(s))
return;
off = obj_to_index(s, page, p);
objcg = page_obj_cgroups(page)[off];
page_obj_cgroups(page)[off] = NULL;
obj_cgroup_uncharge(objcg, obj_full_size(s));
mod_objcg_state(objcg, page_pgdat(page), cache_vmstat_idx(s),
	-obj_full_size(s));
obj_cgroup_put(objcg);
}
-obj_full_size() returns size_t so I guess that's OK.
Yes, we may need to review code that has sign extension to make sure such an error is not happening again.
Also
static __always_inline void uncharge_slab_page(struct page *page, int order,
struct kmem_cache *s)
{
#ifdef CONFIG_MEMCG_KMEM
if (memcg_kmem_enabled() && !is_root_cache(s)) {
memcg_free_page_obj_cgroups(page);
percpu_ref_put_many(&s->memcg_params.refcnt, 1 << order);
}
#endif
mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
	-(PAGE_SIZE << order));
}
PAGE_SIZE is unsigned long so I guess that's OK as well.
Still, perhaps both could be improved. Negating an unsigned scalar is
a pretty ugly thing to do.
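As a rough illustration of why the full-width case behaves differently, here is a small userspace sketch, again not kernel code. DEMO_PAGE_SIZE and apply_delta() are hypothetical names; the page size of 4096 is an assumption. When the negated value is an unsigned long and the parameter is a long of the same width, the conversion is implementation-defined in older C standards, but gcc and clang wrap it to the expected negative number. Converting to long before negating gives the same result without relying on that.

```c
#include <assert.h>

/* Assumed page size for illustration only; the kernel's PAGE_SIZE
 * depends on the architecture and configuration. */
#define DEMO_PAGE_SIZE 4096UL

/* Hypothetical stand-in for a mod_*_state() helper taking a long delta. */
static long apply_delta(long delta)
{
	return delta;
}

/* Pattern from the quoted code: negate in unsigned long and let the
 * call convert the wrapped value to long. */
static long delta_negate_unsigned(int order)
{
	return apply_delta(-(DEMO_PAGE_SIZE << order));
}

/* Possible alternative: convert to long first, then negate a signed
 * value, so no unsigned wraparound is involved. */
static long delta_negate_signed(int order)
{
	return apply_delta(-(long)(DEMO_PAGE_SIZE << order));
}

/* With gcc or clang both forms are expected to agree. */
static void check_page_deltas(void)
{
	assert(delta_negate_signed(1) == -8192L);
	assert(delta_negate_unsigned(1) == delta_negate_signed(1));
}
```

The two forms agree here only because the unsigned type is as wide as long; with a narrower unsigned type the first form would produce a large positive value.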
Am I wrong in thinking that all those mod_foo() functions need careful
review?