Re: [PATCH 2/2] mm,fork: introduce MADV_WIPEONFORK
From: Rik van Riel
Date: Sat Aug 05 2017 - 10:06:28 EST
On Fri, 2017-08-04 at 16:09 -0700, Mike Kravetz wrote:
> On 08/04/2017 12:07 PM, riel@xxxxxxxxxx wrote:
> > From: Rik van Riel <riel@xxxxxxxxxx>
> >
> > Introduce MADV_WIPEONFORK semantics, which result in a VMA being
> > empty in the child process after fork. This differs from
> > MADV_DONTFORK in one important way.
> >
> > If a child process accesses memory that was MADV_WIPEONFORK, it
> > will get zeroes. The address ranges are still valid, they are just
> > empty.
> >
> This didn't seem 'quite right' to me for shared mappings and/or file
> backed mappings.  I wasn't exactly sure what it 'should' do in such
> cases.  So, I tried it with a mapping created as follows:
>
> addr = mmap(ADDR, page_size,
>             PROT_READ | PROT_WRITE,
>             MAP_ANONYMOUS|MAP_SHARED, -1, 0);
Your test program is pretty much the same as the one I used, except
I used MAP_PRIVATE instead of MAP_SHARED.
Let me see how the code paths differ for both cases...
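
For reference, a test along these lines for the MAP_PRIVATE case could
look like the sketch below. It is not the program used in this thread.
The helper name wipeonfork_child_byte() is hypothetical, and the
fallback value 18 for MADV_WIPEONFORK is an assumption for libc headers
that do not define it yet. The parent fills a private anonymous page
with 0xaa, marks it MADV_WIPEONFORK, forks, and the child reports the
first byte it reads through its exit status.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS and madvise() */
#endif
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_WIPEONFORK
#define MADV_WIPEONFORK 18	/* assumed value; older headers lack it */
#endif

/*
 * Hypothetical helper: returns the first byte the child sees in a
 * MAP_PRIVATE anonymous page marked MADV_WIPEONFORK, or -1 on any
 * failure (for example madvise() failing on a kernel without the
 * feature, or the parent's copy being modified).
 */
int wipeonfork_child_byte(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned char *addr;
	int status;
	int ret = -1;
	pid_t pid;

	if (page_size <= 0)
		return -1;

	addr = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
		    MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (addr == MAP_FAILED)
		return -1;

	/* Give the page recognizable, non-zero contents. */
	memset(addr, 0xaa, page_size);

	if (madvise(addr, page_size, MADV_WIPEONFORK) != 0)
		goto out;

	pid = fork();
	if (pid < 0)
		goto out;
	if (pid == 0)
		_exit(addr[0]);	/* child: report the byte it reads */

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		goto out;

	/* The parent's copy should be untouched. */
	if (addr[0] == 0xaa)
		ret = WEXITSTATUS(status);
out:
	munmap(addr, page_size);
	return ret;
}
```

With the semantics described in the patch, calling this from main()
should return 0 on a kernel that has MADV_WIPEONFORK, and -1 where
madvise() rejects the flag. A return of 0xaa would mean the child
inherited the parent's data.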
> When setting MADV_WIPEONFORK on the vma/mapping, I got the following
> at task exit time:
>
> [  694.558290] ------------[ cut here ]------------
> [  694.558978] kernel BUG at mm/filemap.c:212!
> [  694.559476] invalid opcode: 0000 [#1] SMP
> [  694.560023] Modules linked in: ip6t_REJECT nf_reject_ipv6
> ip6t_rpfilter xt_conntrack ebtable_broute bridge stp llc ebtable_nat
> ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
> ip6table_raw ip6table_mangle ip6table_security iptable_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
> iptable_raw iptable_mangle 9p iptable_security ebtable_filter
> ebtables ip6table_filter ip6_tables snd_hda_codec_generic
> snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq ppdev
> snd_seq_device joydev crct10dif_pclmul crc32_pclmul crc32c_intel
> snd_pcm ghash_clmulni_intel 9pnet_virtio virtio_balloon snd_timer
> 9pnet parport_pc snd parport i2c_piix4 soundcore nfsd auth_rpcgss
> nfs_acl lockd grace sunrpc virtio_net virtio_blk virtio_console
> 8139too qxl drm_kms_helper ttm drm serio_raw 8139cp
> [  694.571554]  mii virtio_pci ata_generic virtio_ring virtio
> pata_acpi
> [  694.572608] CPU: 3 PID: 1200 Comm: test_wipe2 Not tainted
> 4.13.0-rc3+ #8
> [  694.573778] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.9.1-1.fc24 04/01/2014
> [  694.574917] task: ffff880137178040 task.stack: ffffc900019d4000
> [  694.575650] RIP: 0010:__delete_from_page_cache+0x344/0x410
> [  694.576409] RSP: 0018:ffffc900019d7a88 EFLAGS: 00010082
> [  694.577238] RAX: 0000000000000021 RBX: ffffea00047d0e00 RCX:
> 0000000000000006
> [  694.578537] RDX: 0000000000000000 RSI: 0000000000000096 RDI:
> ffff88023fd0db90
> [  694.579774] RBP: ffffc900019d7ad8 R08: 00000000000882b6 R09:
> 000000000000028a
> [  694.580754] R10: ffffc900019d7da8 R11: ffffffff8211184d R12:
> ffffea00047d0e00
> [  694.582040] R13: 0000000000000000 R14: 0000000000000202 R15:
> ffff8801384439e8
> [  694.583236] FS:  0000000000000000(0000) GS:ffff88023fd00000(0000)
> knlGS:0000000000000000
> [  694.584607] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  694.585409] CR2: 00007ff77a8da618 CR3: 0000000001e09000 CR4:
> 00000000001406e0
> [  694.586547] Call Trace:
> [  694.586996]  delete_from_page_cache+0x54/0x110
> [  694.587481]  truncate_inode_page+0xab/0x120
> [  694.588110]  shmem_undo_range+0x498/0xa50
> [  694.588813]  ? save_stack_trace+0x1b/0x20
> [  694.589529]  ? set_track+0x70/0x140
> [  694.590150]  ? init_object+0x69/0xa0
> [  694.590722]  ? __inode_wait_for_writeback+0x73/0xe0
> [  694.591525]  shmem_truncate_range+0x16/0x40
> [  694.592268]  shmem_evict_inode+0xb1/0x190
> [  694.592735]  evict+0xbb/0x1c0
> [  694.593147]  iput+0x1c0/0x210
> [  694.593497]  dentry_unlink_inode+0xb4/0x150
> [  694.593982]  __dentry_kill+0xc1/0x150
> [  694.594400]  dput+0x1c8/0x1e0
> [  694.594745]  __fput+0x172/0x1e0
> [  694.595103]  ____fput+0xe/0x10
> [  694.595463]  task_work_run+0x80/0xa0
> [  694.595886]  do_exit+0x2d6/0xb50
> [  694.596323]  ? __do_page_fault+0x288/0x4a0
> [  694.596818]  do_group_exit+0x47/0xb0
> [  694.597249]  SyS_exit_group+0x14/0x20
> [  694.597682]  entry_SYSCALL_64_fastpath+0x1a/0xa5
> [  694.598198] RIP: 0033:0x7ff77a5e78c8
> [  694.598612] RSP: 002b:00007ffc5aece318 EFLAGS: 00000246 ORIG_RAX:
> 00000000000000e7
> [  694.599804] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
> 00007ff77a5e78c8
> [  694.600609] RDX: 0000000000000000 RSI: 000000000000003c RDI:
> 0000000000000000
> [  694.601424] RBP: 00007ff77a8da618 R08: 00000000000000e7 R09:
> ffffffffffffff98
> [  694.602224] R10: 0000000000000003 R11: 0000000000000246 R12:
> 0000000000000001
> [  694.603151] R13: 00007ff77a8dbc60 R14: 0000000000000000 R15:
> 0000000000000000
> [  694.603984] Code: 60 f3 c5 81 e8 2e 7e 03 00 0f 0b 48 c7 c6 60 f3
> c5 81 4c 89 e7 e8 1d 7e 03 00 0f 0b 48 c7 c6 00 f4 c5 81 4c 89 e7 e8
> 0c 7e 03 00 <0f> 0b 48 c7 c6 38 f3 c5 81 4c 89 e7 e8 fb 7d 03 00 0f
> 0b 48 c7
> [  694.606500] RIP: __delete_from_page_cache+0x344/0x410 RSP:
> ffffc900019d7a88
> [  694.607426] ---[ end trace 55e6b04ae95d8ce3 ]---
>
> BTW, this was on 4.13.0-rc3 + your patches.  Simple test program is
> below.
>