Re: [LSF/MM TOPIC] FS, MM, and stable trees

From: Theodore Y. Ts'o
Date: Thu Feb 21 2019 - 13:52:52 EST


On Thu, Feb 21, 2019 at 07:34:15AM -0800, Luis Chamberlain wrote:
> On Sat, Feb 16, 2019 at 01:28:35PM -0500, Theodore Y. Ts'o wrote:
> > The block/*, loop/* and scsi/* tests in blktests do seem to be in
> > pretty good shape. The nvme, nvmeof, and srp tests are *definitely*
> > not as mature.
>
> Can you say more about this later part. What would you like to see more
> of for nvme tests for instance?
>
> It sounds like a productive session would include tracking our:
>
> a) sour spots
> b) who's already working on these
> c) gather volutneers for these sour spots

I complained on another LSF/MM topic thread, but there are a lot of
failures where it's not clear whether it's because I guessed
incorrectly about which version of nvme-cli I should be using (debian
stable and head of nvme-cli both are apparently wrong answers), or
kernel bugs or kernel misconfiguration issues on my side.

Current nvme/* failures that I'm still seeing are attached below.

- Ted

nvme/012 (run mkfs and data verification fio job on NVMeOF block device-backed ns) [failed]
runtime ... 100.265s
something found in dmesg:
[ 1857.188083] run blktests nvme/012 at 2019-02-12 01:11:33
[ 1857.437322] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 1857.456187] nvmet: creating controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:78dc695a-2c99-4841-968d-c2c16a49a02a.
[ 1857.458162] nvme nvme0: ANA group 1: optimized.
[ 1857.458257] nvme nvme0: creating 2 I/O queues.
[ 1857.460893] nvme nvme0: new ctrl: "blktests-subsystem-1"

[ 1857.720666] ============================================
[ 1857.726308] WARNING: possible recursive locking detected
[ 1857.731784] 5.0.0-rc3-xfstests-00014-g1236f7d60242 #843 Not tainted
...
(See '/results/nodev/nvme/012.dmesg' for the entire message)
nvme/013 (run mkfs and data verification fio job on NVMeOF file-backed ns) [failed]
runtime ... 32.634s
--- tests/nvme/013.out 2019-02-11 18:57:39.000000000 -0500
+++ /results/nodev/nvme/013.out.bad 2019-02-12 01:13:46.708757206 -0500
@@ -1,5 +1,9 @@
Running nvme/013
91fdba0d-f87b-4c25-b80f-db7be1418b9e
uuid.91fdba0d-f87b-4c25-b80f-db7be1418b9e
+fio: io_u error on file /mnt/blktests///verify.0.0: Input/output error: write offset=329326592, buflen=4096
+fio: io_u error on file /mnt/blktests///verify.0.0: Input/output error: write offset=467435520, buflen=4096
+fio exited with status 0
+4;fio-3.2;verify;0;5;0;0;0;0;0;0;0.000000;0.000000;0;0;0.000000;0.000000;1.000000%=0;5.000000%=0;10.000000%=0;20.000000%=0;30.000000%=0;40.000000%=0;50.000000%=0;60.000000%=0;70.000000%=0;80.000000%=0;90.000000%=0;95.000000%=0;99.000000%=0;99.500000%=0;99.900000%=0;99.950000%=0;99.990000%=0;0%=0;0%=0;0%=0;0;0;0.000000;0.000000;0;0;0.000000%;0.000000;0.000000;192672;6182;1546;31166;4;9044;63.332763;57.979218;482;29948;10268.332290;1421.459893;1.000000%=4145;5.000000%=9109;10.000000%=9502;20.000000%=9764;30.000000%=10027;40.000000%=10289;50.000000%=10420;60.000000%=10551;70.000000%=10682;80.000000%=10682;90.000000%=10944;95.000000%=11206;99.000000%=13172;99.500000%=16318;99.900000%=24510;99.950000%=27394;99.990000%=29229;0%=0;0%=0;0%=0;507;30005;10331.973087;1421.131712;6040;8232;100.000000%;6189.741935;267.091495;0;0;0;0;0;0;0.000000;0.000000;0;0;0.000000;0.000000;1.000000%=0;5.000000%=0;10.000000%=0;20.000000%=0;30.000000%=0;40.000000%=0;50.000000%=0;60.000000%=0;70.000000%=0;80.000000%=0;90.000000%=0;95.000000%=0;99.000000%=0;99.500000%=0;99.900000%=0;99.950000%=0;99.990000%=0;0%=0;0%=0;0%=0;0;0;0.000000;0.000000;0;0;0.000000%;0.000000;0.000000;3.991657%;6.484839%;91142;0;1024;0.1%;0.1%;0.1%;0.1%;100.0%;0.0%;0.0%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.01%;0.20%;0.34%;0.25%;0.19%;27.86%;70.89%;0.23%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;nvme0n1;0;0;0;0;0;0;0;0.00%
...
(Run 'diff -u tests/nvme/013.out /results/nodev/nvme/013.out.bad' to see the entire diff)
nvme/015 (unit test for NVMe flush for file backed ns) [failed]
runtime ... 8.914s
--- tests/nvme/015.out 2019-02-11 18:57:39.000000000 -0500
+++ /results/nodev/nvme/015.out.bad 2019-02-12 01:14:05.429328259 -0500
@@ -1,6 +1,6 @@
Running nvme/015
91fdba0d-f87b-4c25-b80f-db7be1418b9e
uuid.91fdba0d-f87b-4c25-b80f-db7be1418b9e
-NVMe Flush: success
+NVME IO command error:INTERNAL: The command was not completed successfully due to an internal error(6006)
NQN:blktests-subsystem-1 disconnected 1 controller(s)
Test complete
nvme/016 (create/delete many NVMeOF block device-backed ns and test discovery)
runtime ...
nvme/016 (create/delete many NVMeOF block device-backed ns and test discovery) [failed]
runtime ... 23.576s
--- tests/nvme/016.out 2019-02-11 18:57:39.000000000 -0500
+++ /results/nodev/nvme/016.out.bad 2019-02-12 01:14:29.173378854 -0500
@@ -1,11 +1,11 @@
Running nvme/016

-Discovery Log Number of Records 1, Generation counter 1
+Discovery Log Number of Records 1, Generation counter 5
=====Discovery Log Entry 0======
trtype: loop
adrfam: pci
...
(Run 'diff -u tests/nvme/016.out /results/nodev/nvme/016.out.bad' to see the entire diff)
nvme/017 (create/delete many file-ns and test discovery)
runtime ...
nvme/017 (create/delete many file-ns and test discovery) [failed]
runtime ... 23.592s
--- tests/nvme/017.out 2019-02-11 18:57:39.000000000 -0500
+++ /results/nodev/nvme/017.out.bad 2019-02-12 01:14:52.880762691 -0500
@@ -1,11 +1,11 @@
Running nvme/017

-Discovery Log Number of Records 1, Generation counter 1
+Discovery Log Number of Records 1, Generation counter 2
=====Discovery Log Entry 0======
trtype: loop
adrfam: pci
...
(Run 'diff -u tests/nvme/017.out /results/nodev/nvme/017.out.bad' to see the entire diff)