Re: [PATCHv6 00/36] THP refcounting redesign

From: Jerome Marchand
Date: Tue Jun 16 2015 - 09:17:39 EST


On 06/03/2015 07:05 PM, Kirill A. Shutemov wrote:
> Hello everybody,
>
> Here's a new revision of the refcounting patchset. Please review and
> consider applying.
>
> The goal of the patchset is to make refcounting on THP pages cheaper with
> simpler semantics and allow the same THP compound page to be mapped with
> PMDs and PTEs. This is required to get a reasonable THP-pagecache
> implementation.
>
> With the new refcounting design it's much easier to protect against
> split_huge_page(): a simple reference on a page will do the job.
> It makes the gup_fast() implementation simpler and doesn't require a
> special case in the futex code to handle tail THP pages.
>
> It should improve THP utilization over the system, since splitting a THP in
> one process doesn't necessarily lead to splitting the page in all other
> processes that have the page mapped.
>
> The patchset drastically lowers the complexity of the get_page()/put_page()
> codepaths. I encourage people to look at this code before and after to
> justify the time budget for reviewing this patchset.
>
> = Changelog =
>
> v6:
> - rebase to since-4.0;
> - optimize mapcount handling: significantly reduce overhead for most
> common cases.
> - split pages on migrate_pages();
> - remove infrastructure for handling splitting PMDs on all architectures;
> - fix page_mapcount() for hugetlb pages;
>

Hi Kirill,

I ran some LTP mm tests, and the hugemmap tests trigger the following BUG:

[ 438.749457] page:ffffea0000df8000 count:2 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0
[ 438.750089] flags: 0x3ffc0000004001(locked|head)
[ 438.750089] page dumped because: VM_BUG_ON_PAGE(page_mapped(page))
[ 438.750089] ------------[ cut here ]------------
[ 438.768046] kernel BUG at mm/filemap.c:205!
[ 438.768046] invalid opcode: 0000 [#1] SMP
[ 438.768046] Modules linked in: loop ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ppdev iosf_mbi crct10dif_pclmul crc32_pclmul crc32c_intel joydev ghash_clmulni_intel virtio_balloon pcspkr virtio_console nfsd parport_pc parport floppy pvpanic i2c_piix4 acpi_cpufreq auth_rpcgss nfs_acl lockd grace sunrpc virtio_net qxl virtio_blk drm_kms_helper ttm drm serio_raw ata_generic virtio_pci virtio_ring virtio pata_acpi
[ 438.768046] CPU: 1 PID: 12918 Comm: hugemmap01 Not tainted 4.0.0thprfc-kasv6+ #247
[ 438.768046] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 438.768046] task: ffff88007b09cc40 ti: ffff880077b88000 task.ti: ffff880077b88000
[ 438.768046] RIP: 0010:[<ffffffff811e2aac>] [<ffffffff811e2aac>] __delete_from_page_cache+0x4bc/0x5a0
[ 438.768046] RSP: 0018:ffff880077b8bc58 EFLAGS: 00010086
[ 438.768046] RAX: 0000000000000036 RBX: ffffea0000df8000 RCX: 0000000000000006
[ 438.768046] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88007d5ce9c0
[ 438.768046] RBP: ffff880077b8bcb8 R08: 0000000000000001 R09: 0000000000000001
[ 438.768046] R10: 0000000000000001 R11: ffff880034e44210 R12: ffffea0000df8000
[ 438.768046] R13: ffff88003562cac0 R14: 0000000000000000 R15: ffff88003562cac8
[ 438.768046] FS: 00007fda9ccbb700(0000) GS:ffff88007d400000(0000) knlGS:0000000000000000
[ 438.768046] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 438.768046] CR2: 00007fda9ccc7000 CR3: 00000000785e6000 CR4: 00000000001407e0
[ 438.768046] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 438.768046] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 438.768046] Stack:
[ 438.768046] 0000000000000246 ffff88003562cad8 ffff88003562caf0 0000000000000000
[ 438.768046] ffff88003562cad0 000000009bfc6d69 ffff880077b8bcb8 ffffea0000df8000
[ 438.768046] ffff88003562cad8 0000000000000000 ffffea0000df8000 0000000000000000
[ 438.768046] Call Trace:
[ 438.768046] [<ffffffff811e2be5>] delete_from_page_cache+0x55/0xd0
[ 438.768046] [<ffffffff81380be5>] truncate_hugepages+0x135/0x290
[ 438.768046] [<ffffffff810e7df5>] ? local_clock+0x15/0x30
[ 438.768046] [<ffffffff8110647f>] ? lock_release_holdtime.part.31+0xf/0x190
[ 438.768046] [<ffffffff81380eb8>] hugetlbfs_evict_inode+0x18/0x40
[ 438.768046] [<ffffffff812982bb>] evict+0xab/0x180
[ 438.768046] [<ffffffff81298cee>] iput+0x1ce/0x390
[ 438.768046] [<ffffffff8128aba9>] do_unlinkat+0x209/0x330
[ 438.768046] [<ffffffff81884632>] ? ret_from_sys_call+0x24/0x5f
[ 438.768046] [<ffffffff811095ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[ 438.768046] [<ffffffff8128bf66>] SyS_unlink+0x16/0x20
[ 438.768046] [<ffffffff81884609>] system_call_fastpath+0x12/0x17
[ 438.768046] Code: 49 8b 14 24 4c 89 e0 80 e6 80 74 08 4c 89 e7 e8 15 2e 69 00 8b 40 48 83 c0 01 74 25 48 c7 c6 28 fb c6 81 48 89 df e8 d4 43 03 00 <0f> 0b 48 89 df e8 f4 2d 69 00 48 f7 00 00 c0 00 00 49 89 c4 75
[ 438.768046] RIP [<ffffffff811e2aac>] __delete_from_page_cache+0x4bc/0x5a0
[ 438.768046] RSP <ffff880077b8bc58>
[ 438.768046] ---[ end trace 3903188dcb3f3d48 ]---
[ 438.768046] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:41
[ 438.768046] in_atomic(): 1, irqs_disabled(): 1, pid: 12918, name: hugemmap01
[ 438.768046] INFO: lockdep is turned off.
[ 438.768046] irq event stamp: 6218
[ 438.768046] hardirqs last enabled at (6217): [<ffffffff818812df>] __mutex_unlock_slowpath+0xbf/0x190
[ 438.768046] hardirqs last disabled at (6218): [<ffffffff8188387f>] _raw_spin_lock_irq+0x1f/0x80
[ 438.768046] softirqs last enabled at (6042): [<ffffffff810b0df7>] __do_softirq+0x377/0x670
[ 438.768046] softirqs last disabled at (6027): [<ffffffff810b14ad>] irq_exit+0x11d/0x130
[ 438.768046] CPU: 1 PID: 12918 Comm: hugemmap01 Tainted: G D 4.0.0thprfc-kasv6+ #247
[ 438.768046] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 438.768046] 0000000000000000 000000009bfc6d69 ffff880077b8b8a8 ffffffff81879afa
[ 438.768046] 0000000000000000 ffff88007b09cc40 ffff880077b8b8d8 ffffffff810da0cc
[ 438.768046] 0000000000000000 ffffffff81c68746 0000000000000029 0000000000000000
[ 438.768046] Call Trace:
[ 438.768046] [<ffffffff81879afa>] dump_stack+0x4c/0x65
[ 438.768046] [<ffffffff810da0cc>] ___might_sleep+0x18c/0x250
[ 438.768046] [<ffffffff810da1dd>] __might_sleep+0x4d/0x90
[ 438.768046] [<ffffffff8188163a>] down_read+0x2a/0xa0
[ 438.768046] [<ffffffff810be6c3>] exit_signals+0x33/0x150
[ 438.768046] [<ffffffff810adc2f>] do_exit+0xcf/0xd20
[ 438.768046] [<ffffffff81121006>] ? kmsg_dump+0x166/0x220
[ 438.768046] [<ffffffff81120ed4>] ? kmsg_dump+0x34/0x220
[ 438.768046] [<ffffffff81021cce>] oops_end+0x9e/0xe0
[ 438.768046] [<ffffffff8102224b>] die+0x4b/0x70
[ 438.768046] [<ffffffff8101df80>] do_trap+0xb0/0x150
[ 438.768046] [<ffffffff8101e2f4>] do_error_trap+0xa4/0x180
[ 438.768046] [<ffffffff811e2aac>] ? __delete_from_page_cache+0x4bc/0x5a0
[ 438.768046] [<ffffffff81120255>] ? vprintk_emit+0x285/0x620
[ 438.768046] [<ffffffff81435b9d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 438.768046] [<ffffffff8101ee90>] do_invalid_op+0x20/0x30
[ 438.768046] [<ffffffff818860de>] invalid_op+0x1e/0x30
[ 438.768046] [<ffffffff811e2aac>] ? __delete_from_page_cache+0x4bc/0x5a0
[ 438.768046] [<ffffffff811e2aac>] ? __delete_from_page_cache+0x4bc/0x5a0
[ 438.768046] [<ffffffff811e2be5>] delete_from_page_cache+0x55/0xd0
[ 438.768046] [<ffffffff81380be5>] truncate_hugepages+0x135/0x290
[ 438.768046] [<ffffffff810e7df5>] ? local_clock+0x15/0x30
[ 438.768046] [<ffffffff8110647f>] ? lock_release_holdtime.part.31+0xf/0x190
[ 438.768046] [<ffffffff81380eb8>] hugetlbfs_evict_inode+0x18/0x40
[ 438.768046] [<ffffffff812982bb>] evict+0xab/0x180
[ 438.768046] [<ffffffff81298cee>] iput+0x1ce/0x390
[ 438.768046] [<ffffffff8128aba9>] do_unlinkat+0x209/0x330
[ 438.768046] [<ffffffff81884632>] ? ret_from_sys_call+0x24/0x5f
[ 438.768046] [<ffffffff811095ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[ 438.768046] [<ffffffff8128bf66>] SyS_unlink+0x16/0x20
[ 438.768046] [<ffffffff81884609>] system_call_fastpath+0x12/0x17
[ 438.768046] note: hugemmap01[12918] exited with preempt_count 1

Jerome
