Re: [PATCH 1/3] mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting
From: Vladimir Davydov
Date: Mon Sep 19 2016 - 11:35:57 EST
On Wed, Sep 14, 2016 at 03:48:44PM -0400, Johannes Weiner wrote:
> From: Johannes Weiner <jweiner@xxxxxx>
>
> During cgroup2 rollout into production, we started encountering css
> refcount underflows and css access crashes in the memory controller.
> Splitting the heavily shared css reference counter into logical users
> narrowed the imbalance down to the cgroup2 socket memory accounting.
>
> The problem turns out to be the per-cpu charge cache. Cgroup1 had a
> separate socket counter, but the new cgroup2 socket accounting goes
> through the common charge path that uses a shared per-cpu cache for
> all memory that is being tracked. Those caches are safe against
> scheduling preemption, but not against interrupts - such as the newly
> added packet receive path. When cache draining is interrupted by
> network RX taking pages out of the cache, the resuming drain operation
> will put references of in-use pages, thus causing the imbalance.
>
> Disable IRQs during all per-cpu charge cache operations.
>
> Fixes: f7e1cb6ec51b ("mm: memcontrol: account socket memory in unified hierarchy memory controller")
> Cc: <stable@xxxxxxxxxxxxxxx> # 4.5+
> Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Acked-by: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>