Re: [PATCH v4 00/16] Overhaul multi-page lookups for THP
From: Hugh Dickins
Date: Thu Dec 03 2020 - 17:20:29 EST
On Thu, 3 Dec 2020, Qian Cai wrote:
> On Thu, 2020-12-03 at 18:27 +0100, Marek Szyprowski wrote:
> > On 03.12.2020 16:46, Marek Szyprowski wrote:
> > > On 25.11.2020 03:32, Matthew Wilcox wrote:
> > > > On Tue, Nov 17, 2020 at 11:43:02PM +0000, Matthew Wilcox wrote:
> > > > > On Tue, Nov 17, 2020 at 07:15:13PM +0000, Matthew Wilcox wrote:
> > > > > > I find both of these functions exceptionally confusing. Does this
> > > > > > make it easier to understand?
> > > > > Never mind, this is buggy. I'll send something better tomorrow.
> > > > That took a week, not a day. *sigh*. At least this is shorter.
> > > >
> > > > commit 1a02863ce04fd325922d6c3db6d01e18d55f966b
> > > > Author: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> > > > Date: Tue Nov 17 10:45:18 2020 -0500
> > > >
> > > > fix mm-truncateshmem-handle-truncates-that-split-thps.patch
> > >
> > > This patch landed in today's linux-next (20201203) as commit
> > > 8678b27f4b8b ("8678b27f4b8bfc130a13eb9e9f27171bcd8c0b3b"). Sadly it
> > > breaks booting of ANY of my 32-bit ARM test systems, which use initrd.
> > > ARM64-based systems boot fine. Here is an example of the crash:
> >
> > One more thing. Reverting those two:
> >
> > 1b1aa968b0b6 mm-truncateshmem-handle-truncates-that-split-thps-fix-fix
> >
> > 8678b27f4b8b mm-truncateshmem-handle-truncates-that-split-thps-fix
> >
> > on top of linux next-20201203 fixes the boot issues.
>
> We have to revert those two patches as well to fix an issue where one process
> keeps running at 100% CPU in find_get_entries() and all other threads block on
> the i_mutex almost forever.
>
> [ 380.735099] INFO: task trinity-c58:2143 can't die for more than 125 seconds.
> [ 380.742923] task:trinity-c58 state:R running task stack:26056 pid: 2143 ppid: 1914 flags:0x00004006
> [ 380.753640] Call Trace:
> [ 380.756811] ? find_get_entries+0x339/0x790
> find_get_entry at mm/filemap.c:1848
> (inlined by) find_get_entries at mm/filemap.c:1904
> [ 380.761723] ? __lock_page_or_retry+0x3f0/0x3f0
> [ 380.767009] ? shmem_undo_range+0x3bf/0xb60
> [ 380.771944] ? unmap_mapping_pages+0x96/0x230
> [ 380.777036] ? find_held_lock+0x33/0x1c0
> [ 380.781688] ? shmem_write_begin+0x1b0/0x1b0
> [ 380.786703] ? unmap_mapping_pages+0xc2/0x230
> [ 380.791796] ? down_write+0xe0/0x150
> [ 380.796114] ? do_wp_page+0xc60/0xc60
> [ 380.800507] ? shmem_truncate_range+0x14/0x80
> [ 380.805618] ? shmem_setattr+0x827/0xc70
> [ 380.810274] ? notify_change+0x6cf/0xc30
> [ 380.814941] ? do_truncate+0xe2/0x180
> [ 380.819335] ? do_truncate+0xe2/0x180
> [ 380.823741] ? do_sys_openat2+0x5c0/0x5c0
> [ 380.828484] ? do_sys_ftruncate+0x2e2/0x4e0
> [ 380.833417] ? trace_hardirqs_on+0x1c/0x150
> [ 380.838335] ? do_syscall_64+0x33/0x40
> [ 380.842828] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Thanks for trinitizing. If you have time, please would you try
replacing the shmem_undo_range() in mm/shmem.c with the version I gave in
https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2012031305070.12944@eggly.anvils/T/#mc15d60a2166f80fe284a18d4758eb4c04cc3255d
That will not help at all with the 32-bit booting issue,
but it does have a good chance of placating trinity.
Thanks,
Hugh