XFS mounted with 'discard' option - deleting fio test files slow

From: Lu, Qian
Date: Thu Sep 07 2017 - 14:23:47 EST


Hi XFS mailing list,

We recently received a bug report against XFS mounted with the 'discard' option, and I have been able to reproduce it. I formatted an NVMe SSD with XFS and mounted it with the 'discard' option. Deleting the fio test files then takes a very long time. The report is based on the Linux 4.9 stable tree. I repeated the test on Linux 4.12 and 4.13 and we see the same issue. I ran the tests several times and the results were consistent.

Please see details below.

1. Kernel version: Linux ip-172-31-6-243 4.9.32-15.41.amzn1.x86_64 #1 SMP Thu Jun 22 06:20:54 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

# fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=5G --numjobs=8 --group_reporting
--> Interrupt with Ctrl+C
# time rm -rf fio_test_file.*
--> The session hangs in the 'blocked' state
$ dmesg
...
[  492.329896] INFO: task rm:9231 blocked for more than 120 seconds.
...
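
A filter along the following lines can pull hung-task warnings like the one above out of a long dmesg log. This is only a sketch: the function name hung_tasks is made up for illustration, and it assumes the message format shown in the excerpt above.

```shell
#!/bin/sh
# hung_tasks (hypothetical helper): read kernel log text on stdin and print
# "<command> <pid> <seconds>" for every hung-task warning found.
# Assumes the "INFO: task NAME:PID blocked for more than N seconds." format.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*/\1 \2 \3/p'
}

# Usage on a live system (may need root):
#   dmesg | hung_tasks
```

Fed the line from the excerpt above, it prints "rm 9231 120".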

I then tried backporting some patches and repeated the test. The situation improved: the 'rm' command eventually completed, but it still took a long time (about 2 minutes).

 * Backported patch: 4560e78 xfs: don't block the log commit handler for discards

# fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=5G --numjobs=8 --group_reporting
--> Interrupt with Ctrl+C
# time rm -rf fio_test_file.*
real 2m2.242s
user 0m0.000s
sys 0m25.524s


2. With Linux 4.12 and 4.13.0-rc1 the situation is better and the command no longer gets stuck, but 'rm' still takes a long time (more than 1 minute). Please see details below.

Kernel version: Linux version 4.13.0-rc1+ (ec2-user@ip-172-31-21-25) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC)) #1 SMP Fri Jul 21 17:31:06 UTC 2017

# fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=5G --numjobs=8 --group_reporting
--> Interrupt at about 37%
# time rm -rf fio_test_file.*
real 1m57.912s
user 0m0.000s
sys 0m28.810s

Compare this result with:
    a) XFS mounted with the 'nodiscard' option: 'rm' took less than 1 minute.

    # fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=5G --numjobs=8 --group_reporting
    --> Interrupt at about 39%
    # time rm -rf fio_test_file.*
    real 0m31.176s
    user 0m0.000s
    sys 0m30.005s

    b) ext4 mounted with the 'discard' option: 'rm' took only a few seconds.

    # fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=5G --numjobs=8 --group_reporting
    --> Interrupt at about 36.2%
    # time rm -rf fio_test_file.*
    real 0m4.661s
    user 0m0.000s
    sys 0m4.657s

Please note that if I let the 'fio' command run to 100% completion, the 'rm' command takes less than 1s (0m0.001s).
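
To put the timings above side by side, the following sketch converts the 'real' values to seconds and prints the slowdown factors. The helper names to_seconds and ratio are hypothetical and not part of any tool; the inputs are the 4.13.0-rc1 figures reported above.

```shell
#!/bin/sh
# to_seconds (hypothetical helper): convert a `time`-style duration such as
# "1m57.912s" into seconds, printed with three decimals.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# ratio (hypothetical helper): how many times longer the first duration is
# than the second, printed with one decimal.
ratio() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

# 'real' times from the report: XFS+discard, XFS+nodiscard, ext4+discard.
echo "XFS discard vs XFS nodiscard: $(ratio 1m57.912s 0m31.176s)x"   # 3.8x
echo "XFS discard vs ext4 discard:  $(ratio 1m57.912s 0m4.661s)x"    # 25.3x
```

Note that the fio runs were interrupted at slightly different points (37%, 39%, 36.2%), so the ratios are only a rough comparison.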


3. Shell script that triggers the problem

  sudo su -
  yum install xfsprogs fio -y
  mkfs.xfs -K -f -s size=4096 /dev/nvme0n1
  mkdir -p /media/disk1
  mount -o discard /dev/nvme0n1 /media/disk1
  cd /media/disk1/
  fio --name fio_test_file --direct=1 --rw=randwrite --bs=4k --size=5G --numjobs=8 --group_reporting
  # Interrupt with Ctrl+C
  time rm -rf fio_test_file.*
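
For unattended runs, the manual Ctrl+C step could be replaced by timeout(1), roughly as sketched below. This is not what was used for the numbers in this report: the run_repro name, the RUN_SECS and MNT variables, and the 60 second default are assumptions made for illustration, and timeout delivering SIGINT only approximates pressing Ctrl+C in a terminal.

```shell
#!/bin/sh
# Sketch of a non-interactive variant of the steps above (run as root, with
# the filesystem already created and mounted with -o discard).
# Assumed, not from the report: RUN_SECS, MNT, run_repro, and the 60 s default.
RUN_SECS="${RUN_SECS:-60}"
MNT="${MNT:-/media/disk1}"

run_repro() {
    cd "$MNT" || return 1

    # Let fio run for RUN_SECS seconds, then interrupt it with SIGINT.
    timeout -s INT "$RUN_SECS" fio --name fio_test_file --direct=1 \
        --rw=randwrite --bs=4k --size=5G --numjobs=8 --group_reporting

    # Time the deletion with whole-second wall-clock resolution.
    start=$(date +%s)
    rm -rf fio_test_file.*
    end=$(date +%s)
    echo "rm took $((end - start)) s"
}

# Usage: run_repro
```

Whole-second resolution is coarse, but it is enough to tell a delete that takes a few seconds from one that takes minutes.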


Best Regards,
Qian Lu