Re: [RFC PATCH] proc_sysctl: free invalidate proc_sys_dentry

From: Xishi Qiu
Date: Sun Aug 21 2016 - 22:18:21 EST


On 2016/8/11 15:11, Xishi Qiu wrote:

> From: Fengtiantian <fengtiantian@xxxxxxxxxx>
>

ping

> I found an issue in the dentry cache used by the sysctl proc files.
> If you register a sysctl proc file, access the file, and then unregister it, the number of dentries in the cache keeps increasing and eventually causes a CPU soft lockup.
>
> I tested on kernel 3.10.0-327.
>
> My test case is:
> #!/bin/sh
> while :
> do
> brctl addbr abc
> cat /proc/sys/net/ipv6/conf/abc/autoconf
> brctl delbr abc
> done
>
> Run this script and watch the dentry count in slabinfo keep increasing:
> cat /proc/slabinfo | grep den
> dentry 106624 187026 192 42 2 : tunables 0 0 0 : slabdata 4453 4453 0
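
To track the growth over time without eyeballing the grep output, the line above can be parsed programmatically. This is only a minimal sketch: parse_slab_line() and struct slab_counts are hypothetical names, and it assumes the usual slabinfo 2.x column order (name, active_objs, num_objs, objsize, ...). It matches the cache name exactly, so it does not pick up other caches that "grep den" would also match.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Counts taken from one /proc/slabinfo line. Assumed column order
 * (slabinfo 2.x): name, active_objs, num_objs, objsize, ... */
struct slab_counts {
	unsigned long active_objs;
	unsigned long num_objs;
	unsigned long objsize;
};

/*
 * Hypothetical helper: parse one slabinfo line for the cache named `cache`.
 * Returns 0 on success, -1 if the line is malformed or belongs to a
 * different cache.
 */
static int parse_slab_line(const char *line, const char *cache,
			   struct slab_counts *out)
{
	char name[64];

	assert(line && cache && out);

	/* First four whitespace-separated fields: name and three counters. */
	if (sscanf(line, "%63s %lu %lu %lu", name,
		   &out->active_objs, &out->num_objs, &out->objsize) != 4)
		return -1;

	/* Exact name match, so e.g. "dentry_foo" is rejected. */
	return strcmp(name, cache) == 0 ? 0 : -1;
}
```

Feeding it each line read from /proc/slabinfo once per second and printing active_objs would show the increase while the test script runs.
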
>
> Because the dentry path is the same each time, the dentry name hash is also the same, so all of these dentries are linked into one hash list. The time spent in __d_lookup therefore increases as well.
> In this situation, if you run another script:
>
> #!/bin/sh
> touch testfile1
> while :
> do
> mv testfile1 testfile2
> mv testfile2 testfile1
> done
>
> a CPU soft lockup happens:
> [45029.115429] BUG: soft lockup - CPU#10 stuck for 22s! [cat:18953]
> [45029.121607] Modules linked in: bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter openvswitch(OE) nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack gre libcrc32c kboxdriver(O) kbox(O) ipmi_devintf ipmi_si ipmi_msghandler intel_powerclamp coretemp intel_rapl crc32_pclmul ghash_clmulni_intel aesni_intel sr_mod cdrom iTCO_wdt iTCO_vendor_support lrw gf128mul mei_me glue_helper sb_edac ablk_helper cryptd mei sg ioatdma edac_core shpchp i2c_i801 pcspkr lpc_ich mfd_core vhost_net tun vhost macvtap macvlan vfio_pci ip_tables ext3 mbcache jbd usb_storage sd_mod crc_t10dif crct10dif_generic kvm_intel(O) kvm(O) irqbypass crct10dif_pclmul crct10dif_common crc32c_intel serio_raw igb ahci libahci i2c_algo_bit libata i2c_core dca megaraid_sas ptp pps_core dm_mod vfio_iommu_type1 vfio [last unloaded: signo_catch]
> [45029.121649] CPU: 10 PID: 18953 Comm: cat Tainted: G OEL ---- ------- 3.10.0-327.22.2.23.next.x86_64 #1
> [45029.121650] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. Tecal BH622 V2/BC01SRSA0, BIOS RMISV019 05/10/2012
> [45029.121652] task: ffff880486c1e780 ti: ffff8804c5698000 task.ti: ffff8804c5698000
> [45029.121653] RIP: 0010:[<ffffffff81257c29>] [<ffffffff81257c29>] proc_sys_compare+0x49/0xd0
> [45029.121658] RSP: 0018:ffff8804c569bbb0 EFLAGS: 00000246
> [45029.121659] RAX: 0000000000000000 RBX: 1308000000000000 RCX: 0000000000000002
> [45029.121660] RDX: 0000000000000002 RSI: ffff8804921173b8 RDI: ffff8804a239f038
> [45029.121661] RBP: ffff8804c569bbc8 R08: 0000000000000063 R09: 0000000000000000
> [45029.121662] R10: 1308000000000000 R11: ffff880492117380 R12: ffff880a182c3b00
> [45029.121663] R13: ffff8804c569bbc8 R14: ffff88067ffdb008 R15: ffffffff8116db65
> [45029.121664] FS: 00007fae0d59f740(0000) GS:ffff8806676c0000(0000) knlGS:0000000000000000
> [45029.121665] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [45029.121666] CR2: 00007fae0d0a3540 CR3: 00000004a014d000 CR4: 00000000000407e0
> [45029.121667] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [45029.121668] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [45029.121669] Stack:
> [45029.121670] ffff880492117388 ffff88065ba2ccc0 ffff8804c569be60 ffff8804c569bc18
> [45029.121672] ffffffff811fa47c ffff880492117380 ffff8804a239f038 ffff880400000003
> [45029.121674] 0000000005d9368a ffff8804c569be60 ffff88065ba2ccc0 ffff8804c569bc97
> [45029.121676] Call Trace:
> [45029.121680] [<ffffffff811fa47c>] __d_lookup+0x14c/0x160
> [45029.121681] [<ffffffff811fa4ba>] d_lookup+0x2a/0x50
> [45029.121684] [<ffffffff811eb550>] lookup_dcache+0x30/0xb0
> [45029.121685] [<ffffffff811eb5fd>] __lookup_hash+0x2d/0x60
> [45029.121689] [<ffffffff81635670>] lookup_slow+0x42/0xa7
> [45029.121691] [<ffffffff811efc4f>] link_path_walk+0x83f/0x8e0
> [45029.121695] [<ffffffff812fc522>] ? radix_tree_lookup_slot+0x22/0x50
> [45029.121697] [<ffffffff811f0c93>] path_openat+0xa3/0x4c0
> [45029.121700] [<ffffffff81195151>] ? __do_fault+0x401/0x510
> [45029.121702] [<ffffffff811f24ab>] do_filp_open+0x4b/0xb0
> [45029.121705] [<ffffffff811ff017>] ? __alloc_fd+0xa7/0x130
> [45029.121707] [<ffffffff811dfdb3>] do_sys_open+0xf3/0x1f0
> [45029.121709] [<ffffffff811dfece>] SyS_open+0x1e/0x20
>
> Signed-off-by: Fengtiantian <fengtiantian@xxxxxxxxxx>
> ---
> fs/proc/proc_sysctl.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> index 5e57c3e..4ee1093 100644
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -850,6 +850,8 @@ static int proc_sys_compare(const struct dentry *parent, const struct dentry *de
> return 1;
> if (memcmp(name->name, str, len))
> return 1;
> + if (PROC_I(dentry->d_inode)->sysctl->unregistering)
> + return 0;
> head = rcu_dereference(PROC_I(inode)->sysctl);
> return !head || !sysctl_is_seen(head);
> }
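
For anyone following the thread, here is a minimal userspace sketch of the effect described in the report. It is a simplified model and not kernel code: struct entry, chain_lookup() and simulate() are hypothetical names, the bool flag stands in for a non-NULL sysctl->unregistering, and unlinking the stale entry during the walk stands in for what the patch appears to aim for (let the compare match the stale dentry so that revalidation can drop it). Each round models one iteration of the test script: the lookup misses because every old entry is stale, a fresh entry is added, and then it is marked stale by the unregister.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One cached name on a hash chain (toy stand-in for a dentry). */
struct entry {
	const char *name;
	bool unregistering;	/* models sysctl->unregistering != NULL */
	struct entry *next;
};

struct chain {
	struct entry *head;
	size_t len;
};

/* Add a fresh, live entry at the head of the chain. */
static struct entry *chain_add(struct chain *c, const char *name)
{
	struct entry *e = calloc(1, sizeof(*e));

	assert(e);
	e->name = name;
	e->next = c->head;
	c->head = e;
	c->len++;
	return e;
}

/*
 * Walk the chain looking for a live entry called `name`.
 * With reap_stale set, stale entries with a matching name are unlinked
 * and freed as they are found. *steps receives the entries visited.
 */
static struct entry *chain_lookup(struct chain *c, const char *name,
				  bool reap_stale, size_t *steps)
{
	struct entry **pp = &c->head;

	*steps = 0;
	while (*pp) {
		struct entry *e = *pp;

		(*steps)++;
		if (strcmp(e->name, name) == 0) {
			if (!e->unregistering)
				return e;
			if (reap_stale) {
				*pp = e->next;
				free(e);
				c->len--;
				continue;
			}
		}
		pp = &e->next;
	}
	return NULL;
}

/*
 * Run `rounds` register/access/unregister cycles. Returns the final
 * chain length; *last_steps is the cost of the last lookup.
 */
static size_t simulate(size_t rounds, bool reap_stale, size_t *last_steps)
{
	struct chain c = { NULL, 0 };
	size_t len;

	*last_steps = 0;
	for (size_t i = 0; i < rounds; i++) {
		struct entry *e;

		/* "cat": every older entry is stale, so this misses. */
		e = chain_lookup(&c, "autoconf", reap_stale, last_steps);
		assert(e == NULL);
		e = chain_add(&c, "autoconf");	/* new entry instantiated */
		e->unregistering = true;	/* "brctl delbr" */
	}

	len = c.len;
	while (c.head) {
		struct entry *e = c.head;

		c.head = e->next;
		free(e);
	}
	return len;
}
```

In this model, simulate(1000, false, &steps) leaves 1000 entries on the chain and the last lookup visits 999 of them, while simulate(1000, true, &steps) leaves a single entry and the last lookup visits one. The real dcache has more going on (RCU, refcounts, the LRU), so this only illustrates why the walk in __d_lookup gets longer with every iteration.
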