Re: Kmemleak infrastructure improvement for task_struct leaks and call_rcu()
From: Catalin Marinas
Date: Thu May 07 2020 - 13:14:24 EST
On Wed, May 06, 2020 at 10:40:19AM -0700, Paul E. McKenney wrote:
> On Wed, May 06, 2020 at 12:22:37PM -0400, Qian Cai wrote:
> > == call_rcu() leaks ==
> > Another issue that might be relevant is that it seems sometimes,
> > kmemleak will give a lot of false positives (hundreds) because the
> > memory was supposed to be freed by call_rcu() (for example, in
> > dst_release()) but for some reasons, it takes a long time probably
> > waiting for grace periods or some kind of RCU self-stall, but the
> > memory had already became an orphan. I am not sure how we are going
> > to resolve this properly until we have to figure out why call_rcu()
> > is taking so long to finish?
>
> I know nothing about kmemleak, but I won't let that stop me from making
> random suggestions...
>
> One approach is to do an rcu_barrier() inside kmemleak just before
> printing leaked blocks, and check to see if any are still leaked after
> the rcu_barrier().

The main issue is that kmemleak doesn't stop the world when scanning
(which can take over a minute, depending on your hardware), so we get
lots of transient pointer misses. There are some heuristics but
obviously they don't always work.

With RCU, objects are queued for RCU freeing later and chained via
rcu_head.next (IIUC). Under load, this list can be pretty volatile and
if this happens during kmemleak scanning, it's sufficient to lose track
of a single next pointer for the rest of the list to be reported as a leak.

I think an rcu_barrier() just before starting the kmemleak scan may
help if it reduces the number of objects queued.

Now, I wonder whether kmemleak itself can break this RCU chain. The
kmemleak metadata is allocated on a slab alloc callback. The freeing,
however, is done using call_rcu() because originally calling back into
the slab freeing from kmemleak_free() didn't go well. Since the
kmemleak_object structure is not tracked by kmemleak, I wonder whether
its rcu_head would break this directed pointer reference graph.

Let's try the rcu_barrier() first and I'll think about the metadata case
over the weekend.

Thanks.

--
Catalin