Re: 4.1.28: memory leak introduced by "mm/swap.c: flush lru pvecs on compound page arrival"

From: Michal Hocko
Date: Mon Jul 18 2016 - 02:53:21 EST


On Sat 16-07-16 23:47:40, Minchan Kim wrote:
> On Fri, Jul 15, 2016 at 09:27:55PM +0200, Jens Rottmann wrote:
> > Hi,
> >
> > 4.1.y stable commit c5ad33184354260be6d05de57e46a5498692f6d6 (Upstream
> > commit 8f182270dfec432e93fae14f9208a6b9af01009f) "mm/swap.c: flush lru
> > pvecs on compound page arrival" in 4.1.28 introduces a memory leak.
> >
> > Simply running
> >
> > while sleep 0.1; do clear; free; done
> >
> > shows mem continuously going down, eventually system panics with no
> > killable processes left. Using "unxz -t some.xz" instead of sleep brings
> > system down within minutes.
> >
> > Kmemleak did not report anything. Bisect ended at named commit, and
> > reverting only this commit is indeed sufficient to fix the leak. Swap
> > partition on/off makes no difference.
> >
> > My set-up:
> > i.MX6 (ARM Cortex-A9) dual-core, 2 GB RAM. Kernel sources are from
> > git.freescale.com i.e. heavily modified by Freescale for i.MX SoCs,
> > kernel.org stable patches up to 4.1.28 manually added.
> >
> > I tried to reproduce with vanilla 4.1.28, but that wouldn't boot at all
> > on my hardware, hangs immediately after "Starting kernel", sorry.
> > However there is not a single difference between Freescale and vanilla
> > in the whole mm/ subdirectory, so I don't think it's i.MX-specific. I
> > didn't cross-check with an x86 system (yet).
>
> I don't have the 4.1 stable tree locally, so I just looked at the git
> web view and found that __lru_cache_add has a bug.
>
> Please replace
>
> static void __lru_cache_add(struct page *page)
> {
> 	struct pagevec *pvec = &get_cpu_var(lru_add_pvec);
>
> 	page_cache_get(page);
> 	if (!pagevec_space(pvec) || PageCompound(page)) <==
> 		__pagevec_lru_add(pvec);
> 	put_cpu_var(lru_add_pvec);
> }
>
> with
>
> static void __lru_cache_add(struct page *page)
> {
> 	struct pagevec *pvec = &get_cpu_var(lru_add_pvec);
>
> 	page_cache_get(page);
> 	if (!pagevec_add(pvec, page) || PageCompound(page)) <==
> 		__pagevec_lru_add(pvec);
> 	put_cpu_var(lru_add_pvec);
> }
>

Yes, this is it. Steven reported it last week, and Sasha should be
aware of it: http://lkml.kernel.org/r/20160714175521.3675e3d6@xxxxxxxxxxxxxxxxxx
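
To spell out why the pagevec_space() variant leaks: page_cache_get()
takes a reference on every call, but the page is never stored in the
pagevec, so the later flush never sees it and never drops that
reference. A rough userspace sketch of the accounting follows. It is not
kernel code. The struct layouts, pagevec_flush() (a stand-in for
__pagevec_lru_add(), which releases the pagevec's references once the
pages are on the LRU), the two lru_cache_add_*() helpers, count_leaked()
and MAX_PAGES are simplified assumptions for illustration only.

```c
#include <assert.h>

#define PAGEVEC_SIZE 14   /* matches the kernel's pagevec capacity */
#define MAX_PAGES    64   /* arbitrary size for this model */

/* Heavily simplified stand-ins for the kernel structures. */
struct page {
	int refcount;
	int compound;
};

struct pagevec {
	unsigned int nr;
	struct page *pages[PAGEVEC_SIZE];
};

static unsigned int pagevec_space(struct pagevec *pvec)
{
	return PAGEVEC_SIZE - pvec->nr;
}

/* Store the page and return the space left, like the kernel helper. */
static unsigned int pagevec_add(struct pagevec *pvec, struct page *page)
{
	assert(pvec->nr < PAGEVEC_SIZE);
	pvec->pages[pvec->nr++] = page;
	return pagevec_space(pvec);
}

/* Model of the flush: drop the pagevec's reference on each stored page. */
static void pagevec_flush(struct pagevec *pvec)
{
	for (unsigned int i = 0; i < pvec->nr; i++)
		pvec->pages[i]->refcount--;
	pvec->nr = 0;
}

/* Broken backport: takes a reference but never stores the page. */
static void lru_cache_add_buggy(struct pagevec *pvec, struct page *page)
{
	page->refcount++;
	if (!pagevec_space(pvec) || page->compound)
		pagevec_flush(pvec);
}

/* Corrected version: the page is stored, so a flush can release it. */
static void lru_cache_add_fixed(struct pagevec *pvec, struct page *page)
{
	page->refcount++;
	if (!pagevec_add(pvec, page) || page->compound)
		pagevec_flush(pvec);
}

/* Push npages through 'add', drain, and count pages with a stuck ref. */
static int count_leaked(void (*add)(struct pagevec *, struct page *),
			int npages)
{
	struct page pages[MAX_PAGES];
	struct pagevec pvec = { 0 };
	int leaked = 0;

	assert(npages <= MAX_PAGES);
	for (int i = 0; i < npages; i++) {
		pages[i].refcount = 1;   /* the caller's own reference */
		pages[i].compound = 0;
		add(&pvec, &pages[i]);
	}
	pagevec_flush(&pvec);            /* final drain */

	for (int i = 0; i < npages; i++)
		if (pages[i].refcount != 1)
			leaked++;
	return leaked;
}
```

In this model, count_leaked(lru_cache_add_buggy, n) leaves every page
with an extra reference, while count_leaked(lru_cache_add_fixed, n)
leaves none. A reference that is held but unreachable through any
tracked allocation would also be consistent with kmemleak staying
silent.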

--
Michal Hocko
SUSE Labs