Re: [PATCH v5 2/2] writeback, cgroup: release dying cgwbs by switching attached inodes

From: Ming Lei
Date: Thu May 27 2021 - 22:58:23 EST


On Wed, May 26, 2021 at 03:25:57PM -0700, Roman Gushchin wrote:
> Asynchronously try to release dying cgwbs by switching clean attached
> inodes to the bdi's wb. It helps to get rid of per-cgroup writeback
> structures themselves and of pinned memory and block cgroups, which
> are way larger structures (mostly due to large per-cpu statistics
> data). It helps to prevent memory waste and different scalability
> problems caused by large piles of dying cgroups.
>
> A cgwb cleanup operation can fail due to different reasons (e.g. the
> cgwb has in-flight/pending io, an attached inode is locked or isn't
> clean, etc). In this case the next scheduled cleanup will make a new
> attempt. An attempt is made each time a new cgwb is offlined (in other
> words a memcg and/or a blkcg is deleted by a user). In the future an
> additional attempt scheduled by a timer can be implemented.
>
> Signed-off-by: Roman Gushchin <guro@xxxxxx>
> ---
> fs/fs-writeback.c | 35 ++++++++++++++++++
> include/linux/backing-dev-defs.h | 1 +
> include/linux/writeback.h | 1 +
> mm/backing-dev.c | 61 ++++++++++++++++++++++++++++++--
> 4 files changed, 96 insertions(+), 2 deletions(-)
>

Hello Roman,

With this patch applied, xfstests generic/563 triggers the following warning
and general protection fault:

[root@ktest-01 xfstests-dev]# ./check generic/563
[ 47.186811] SGI XFS with ACLs, security attributes, realtime, verbose warnings, quota, no debug enabled
[ 47.190152] XFS (sdb): Mounting V5 Filesystem
[ 47.201551] XFS (sdb): Ending clean mount
[ 47.205501] xfs filesystem being mounted at /mnt/test supports timestamps until 2038 (0x7fffffff)
FSTYP -- xfs (non-debug)
PLATFORM -- Linux/x86_64 ktest-01 5.13.0-rc3+ #294 SMP Fri May 28 10:51:02 CST 2021
MKFS_OPTIONS -- -f -bsize=4096 /dev/sda
MOUNT_OPTIONS -- /dev/sda /mnt/scratch

[ 47.431775] XFS (sda): Mounting V5 Filesystem
[ 47.441731] XFS (sda): Ending clean mount
[ 47.445080] xfs filesystem being mounted at /mnt/scratch supports timestamps until 2038 (0x7fffffff)
[ 47.449189] XFS (sda): Unmounting Filesystem
[ 47.473863] XFS (sdb): Unmounting Filesystem
[ 47.614561] XFS (sdb): Mounting V5 Filesystem
[ 47.628670] XFS (sdb): Ending clean mount
[ 47.631904] xfs filesystem being mounted at /mnt/test supports timestamps until 2038 (0x7fffffff)
generic/563 1s ... [ 47.661393] run fstests generic/563 at 2021-05-28 02:54:59
[ 47.947414] loop0: detected capacity change from 0 to 16777216
[ 48.034564] XFS (loop0): Mounting V5 Filesystem
[ 48.069959] XFS (loop0): Ending clean mount
[ 48.070726] xfs filesystem being mounted at /mnt/scratch supports timestamps until 2038 (0x7fffffff)
[ 48.132314] XFS (loop0): Unmounting Filesystem
[ 48.204548] XFS (loop0): Mounting V5 Filesystem
[ 48.215500] XFS (loop0): Ending clean mount
[ 48.219223] xfs filesystem being mounted at /mnt/scratch supports timestamps until 2038 (0x7fffffff)
[ 48.534420] XFS (loop0): Unmounting Filesystem
[ 48.535142] ------------[ cut here ]------------
[ 48.535921] WARNING: CPU: 3 PID: 114 at mm/backing-dev.c:402 cgwb_release_workfn+0xa4/0xd8
[ 48.537461] Modules linked in: xfs libcrc32c iTCO_wdt i2c_i801 iTCO_vendor_support nvme i2c_smbus lpc_ich usb_storage i2c_s
[ 48.540613] CPU: 3 PID: 114 Comm: kworker/3:1 Not tainted 5.13.0-rc3+ #294
[ 48.541927] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-1.fc33 04/01/2014
[ 48.543439] Workqueue: cgwb_release cgwb_release_workfn
[ 48.544365] RIP: 0010:cgwb_release_workfn+0xa4/0xd8
[ 48.545185] Code: 00 00 00 48 85 db 75 d5 48 8d 7d 80 e8 98 71 20 00 48 8d bd 70 ff ff ff e8 36 7b 1d 00 48 8b 55 f0 48 8d4
[ 48.548935] RSP: 0018:ffffc90001f47e88 EFLAGS: 00010202
[ 48.549844] RAX: ffff88810321d280 RBX: ffffffff82f69ac0 RCX: 0000000080400011
[ 48.552645] RDX: ffffffff82669f00 RSI: 0000000000210d00 RDI: ffff888100042500
[ 48.553935] RBP: ffff88810321d290 R08: 0000000000000001 R09: ffffffff811c8754
[ 48.555054] R10: 000000000000005e R11: 0000000000000046 R12: ffff88810321d000
[ 48.556183] R13: ffff88815c4f2300 R14: 0000000000000000 R15: 0000000000000000
[ 48.557116] FS: 0000000000000000(0000) GS:ffff88815c4c0000(0000) knlGS:0000000000000000
[ 48.558131] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 48.558818] CR2: 00007f3aba17f9c0 CR3: 0000000108e86004 CR4: 0000000000370ee0
[ 48.559950] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 48.561057] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 48.562075] Call Trace:
[ 48.562421] elfcorehdr_read+0xf/0xf
[ 48.562920] ? worker_thread+0x117/0x1b9
[ 48.563443] ? rescuer_thread+0x291/0x291
[ 48.564001] ? kthread+0xec/0xf4
[ 48.564411] ? kthread_create_worker_on_cpu+0x65/0x65
[ 48.565086] ? ret_from_fork+0x1f/0x30
[ 48.565594] ---[ end trace bdeef00aa75cca5c ]---
[ 48.601694] XFS (loop0): Mounting V5 Filesystem
[ 48.605863] XFS (loop0): Ending clean mount
[ 48.607129] xfs filesystem being mounted at /mnt/scratch supports timestamps until 2038 (0x7fffffff)
[ 48.830734] general protection fault, probably for non-canonical address 0xffff11033f71f000: 0000 [#1] SMP NOPTI
[ 48.832720] CPU: 10 PID: 234 Comm: kworker/10:1 Tainted: G W 5.13.0-rc3+ #294
[ 48.833932] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-1.fc33 04/01/2014
[ 48.835146] Workqueue: events cleanup_offline_cgwbs_workfn
[ 48.835952] RIP: 0010:percpu_ref_tryget_many.constprop.0+0x12/0x43
[ 48.836849] Code: 48 8b 47 08 48 8b 40 08 ff d0 0f 1f 00 eb 04 65 48 ff 08 e9 25 fd ff ff 41 54 48 8b 07 a8 03 74 09 48 8b4
[ 48.839494] RSP: 0018:ffffc90001383e58 EFLAGS: 00010046
[ 48.840246] RAX: ffff88810321f000 RBX: ffff88810321d280 RCX: ffff8881016f9770
[ 48.841165] RDX: ffffffff82669f00 RSI: 0000000000000280 RDI: ffff88810321d200
[ 48.842050] RBP: ffffc90001383e68 R08: ffff88810006c8b0 R09: 000073746e657665
[ 48.843224] R10: 8080808080808080 R11: fefefefefefefeff R12: ffff88810321d200
[ 48.844133] R13: ffff88810321d000 R14: 0000000000000000 R15: 0000000000000000
[ 48.845022] FS: 0000000000000000(0000) GS:ffff88823c500000(0000) knlGS:0000000000000000
[ 48.845903] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 48.846495] CR2: 00007fb630166198 CR3: 0000000179a1e006 CR4: 0000000000370ee0
[ 48.847414] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 48.848405] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 48.849126] Call Trace:
[ 48.849388] cleanup_offline_cgwbs_workfn+0x8a/0x14c
[ 48.849906] process_one_work+0x15c/0x234
[ 48.850346] worker_thread+0x117/0x1b9
[ 48.850706] ? rescuer_thread+0x291/0x291
[ 48.851065] kthread+0xec/0xf4
[ 48.851346] ? kthread_create_worker_on_cpu+0x65/0x65
[ 48.851815] ret_from_fork+0x1f/0x30
[ 48.852151] Modules linked in: xfs libcrc32c iTCO_wdt i2c_i801 iTCO_vendor_support nvme i2c_smbus lpc_ich usb_storage i2c_s
[ 48.854166] Dumping ftrace buffer:
[ 48.854546] (ftrace buffer empty)
[ 48.854909] ---[ end trace bdeef00aa75cca5d ]---
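
For readers following the second trace: the fault is in
percpu_ref_tryget_many(), called from cleanup_offline_cgwbs_workfn(). Below is
a minimal user-space sketch of the general "walk the offline list and try to
take a reference" pattern described in the commit message. It is not the kernel
code: struct fake_wb, fake_wb_tryget(), fake_wb_put() and cleanup_offline() are
hypothetical names invented for illustration, and a plain int stands in for the
percpu refcount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgwb sitting on a list of offline wbs. */
struct fake_wb {
	int refcnt;		/* 0 means the object is already being released */
	struct fake_wb *next;	/* next entry on the (simplified) offline list */
	bool busy;		/* stand-in for "has pending io / dirty inodes" */
	bool cleaned;		/* set once cleanup succeeded for this entry */
};

/* Take a reference only if the count has not already dropped to zero. */
static bool fake_wb_tryget(struct fake_wb *wb)
{
	if (wb->refcnt == 0)
		return false;
	wb->refcnt++;
	return true;
}

/* Drop a reference previously taken with fake_wb_tryget(). */
static void fake_wb_put(struct fake_wb *wb)
{
	assert(wb->refcnt > 0);
	wb->refcnt--;
}

/*
 * Walk the offline list once. Entries whose refcount is already zero are
 * skipped; busy entries are left for a later attempt. Returns the number
 * of entries cleaned in this pass.
 */
static int cleanup_offline(struct fake_wb *head)
{
	int cleaned = 0;

	for (struct fake_wb *wb = head; wb; wb = wb->next) {
		if (!fake_wb_tryget(wb))
			continue;	/* already dying, nothing to do */
		if (!wb->busy) {
			wb->cleaned = true;
			cleaned++;
		}
		fake_wb_put(wb);
	}
	return cleaned;
}
```

Note that a tryget of this kind only helps while the entry's memory is still
valid, so in this sketch every entry is assumed to stay allocated for as long
as it is reachable from the list.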



Thanks,
Ming