Re: [PATCH] mm, slab: Fix sign conversion problem in memcg_uncharge_slab()

From: Waiman Long
Date: Sat Jun 20 2020 - 16:48:13 EST


On 6/20/20 3:59 PM, Roman Gushchin wrote:
On Sat, Jun 20, 2020 at 02:47:19PM -0400, Waiman Long wrote:
It was found that running the LTP test on a PowerPC system could produce
erroneous values in /proc/meminfo, like:

MemTotal: 531915072 kB
MemFree: 507962176 kB
MemAvailable: 1100020596352 kB

Using bisection, the problem is tracked down to commit 9c315e4d7d8c
("mm: memcg/slab: cache page number in memcg_(un)charge_slab()").

In memcg_uncharge_slab() with a "int order" argument:

unsigned int nr_pages = 1 << order;
:
mod_lruvec_state(lruvec, cache_vmstat_idx(s), -nr_pages);

The mod_lruvec_state() function will eventually call the
__mod_zone_page_state() which accepts a long argument. Depending on
the compiler and how inlining is done, "-nr_pages" may be treated as
a negative number or a very large positive number. Apparently, it was
treated as a large positive number in that PowerPC system leading to
incorrect stat counts. This problem hasn't been seen in x86-64 yet,
perhaps the gcc compiler there has some slight difference in behavior.

It is fixed by making nr_pages a signed value. For consistency, a
similar change is applied to memcg_charge_slab() as well.

Fixes: 9c315e4d7d8c ("mm: memcg/slab: cache page number in memcg_(un)charge_slab()").
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
Good catch!

Interesting that I haven't seen it on x86-64, but it's reproducible on Power.

Acked-by: Roman Gushchin <guro@xxxxxx>

I think it is probably related to the level of inlining that are being done. Besides, the interpretation of -nr_pages is ambiguous if nr_pages is an unsigned value and different compilers may handle it differently when sign extension is needed.

Cheers,
Longman