Re: [PATCH] mm: fix trying to reclaim unevicable LRU page

From: Minchan Kim
Date: Mon Jun 10 2019 - 07:25:42 EST


On Mon, Jun 10, 2019 at 12:39:46PM +0200, Michal Hocko wrote:
> On Mon 10-06-19 18:42:22, Minchan Kim wrote:
> > On Tue, Jun 04, 2019 at 02:28:06PM +0200, Michal Hocko wrote:
> > > On Thu 30-05-19 11:42:29, Minchan Kim wrote:
> > > > On Tue, May 28, 2019 at 05:14:07PM +0200, Michal Hocko wrote:
> > > > > [Cc Pankaj Suryawanshi who has reported a similar problem
> > > > > http://lkml.kernel.org/r/SG2PR02MB309806967AE91179CAFEC34BE84B0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx]
> > > > >
> > > > > On Fri 24-05-19 16:11:14, Minchan Kim wrote:
> > > > > > There was below bugreport from Wu Fangsuo.
> > > > > >
> > > > > > 7200 [ 680.491097] c4 7125 (syz-executor) page:ffffffbf02f33b40 count:86 mapcount:84 mapping:ffffffc08fa7a810 index:0x24
> > > > > > 7201 [ 680.531186] c4 7125 (syz-executor) flags: 0x19040c(referenced|uptodate|arch_1|mappedtodisk|unevictable|mlocked)
> > > > > > 7202 [ 680.544987] c0 7125 (syz-executor) raw: 000000000019040c ffffffc08fa7a810 0000000000000024 0000005600000053
> > > > > > 7203 [ 680.556162] c0 7125 (syz-executor) raw: ffffffc009b05b20 ffffffc009b05b20 0000000000000000 ffffffc09bf3ee80
> > > > > > 7204 [ 680.566860] c0 7125 (syz-executor) page dumped because: VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page))
> > > > > > 7205 [ 680.578038] c0 7125 (syz-executor) page->mem_cgroup:ffffffc09bf3ee80
> > > > > > 7206 [ 680.585467] c0 7125 (syz-executor) ------------[ cut here ]------------
> > > > > > 7207 [ 680.592466] c0 7125 (syz-executor) kernel BUG at /home/build/farmland/adroid9.0/kernel/linux/mm/vmscan.c:1350!
> > > > > > 7223 [ 680.603663] c0 7125 (syz-executor) Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> > > > > > 7224 [ 680.611436] c0 7125 (syz-executor) Modules linked in:
> > > > > > 7225 [ 680.616769] c0 7125 (syz-executor) CPU: 0 PID: 7125 Comm: syz-executor Tainted: G S 4.14.81 #3
> > > > > > 7226 [ 680.626826] c0 7125 (syz-executor) Hardware name: ASR AQUILAC EVB (DT)
> > > > > > 7227 [ 680.633623] c0 7125 (syz-executor) task: ffffffc00a54cd00 task.stack: ffffffc009b00000
> > > > > > 7228 [ 680.641917] c0 7125 (syz-executor) PC is at shrink_page_list+0x1998/0x3240
> > > > > > 7229 [ 680.649144] c0 7125 (syz-executor) LR is at shrink_page_list+0x1998/0x3240
> > > > > > 7230 [ 680.656303] c0 7125 (syz-executor) pc : [<ffffff90083a2158>] lr : [<ffffff90083a2158>] pstate: 60400045
> > > > > > 7231 [ 680.666086] c0 7125 (syz-executor) sp : ffffffc009b05940
> > > > > > ..
> > > > > > 7342 [ 681.671308] c0 7125 (syz-executor) [<ffffff90083a2158>] shrink_page_list+0x1998/0x3240
> > > > > > 7343 [ 681.679567] c0 7125 (syz-executor) [<ffffff90083a3dc0>] reclaim_clean_pages_from_list+0x3c0/0x4f0
> > > > > > 7344 [ 681.688793] c0 7125 (syz-executor) [<ffffff900837ed64>] alloc_contig_range+0x3bc/0x650
> > > > > > 7347 [ 681.717421] c0 7125 (syz-executor) [<ffffff90084925cc>] cma_alloc+0x214/0x668
> > > > > > 7348 [ 681.724892] c0 7125 (syz-executor) [<ffffff90091e4d78>] ion_cma_allocate+0x98/0x1d8
> > > > > > 7349 [ 681.732872] c0 7125 (syz-executor) [<ffffff90091e0b20>] ion_alloc+0x200/0x7e0
> > > > > > 7350 [ 681.740302] c0 7125 (syz-executor) [<ffffff90091e154c>] ion_ioctl+0x18c/0x378
> > > > > > 7351 [ 681.747738] c0 7125 (syz-executor) [<ffffff90084c6824>] do_vfs_ioctl+0x17c/0x1780
> > > > > > 7352 [ 681.755514] c0 7125 (syz-executor) [<ffffff90084c7ed4>] SyS_ioctl+0xac/0xc0
> > > > > >
> > > > > > Wu found it's due to [1]. Before that, unevictable page goes to cull_mlocked
> > > > > > routine so that it couldn't reach the VM_BUG_ON_PAGE line.
> > > > > >
> > > > > > To fix the issue, this patch filter out unevictable LRU pages
> > > > > > from the reclaim_clean_pages_from_list in CMA.
> > > > >
> > > > > The changelog is rather modest on details and I have to confess I have
> > > > > little bit hard time to understand it. E.g. why do not we need to handle
> > > > > the regular reclaim path?
> > > >
> > > > No need to pass unevictable pages into regular reclaim patch if we are
> > > > able to know in advance.
> > >
> > > I am sorry to be dense here. So what is the difference in the CMA path?
> > > Am I right that the pfn walk (CMA) rather than LRU isolation (reclaim)
> > > is the key differentiator?
> >
> > Yes.
> > We could isolate unevictable LRU pages from the pfn waker to migrate and
> > could discard clean file-backed pages to reduce migration latency in CMA
> > path.
>
> Please be explicit about that in the changelog. The fact that this is
> not possible from the regular reclaim path is really important and not
> obvious from the first glance.
>
> Thanks!

Here it goes:

Andrew, could you replace the patch with the below if Michal doesn't have
any suggestion?

Thanks.