Fuse Kernel Freeze!

From: Aleksander Wieliczko
Date: Fri Jan 22 2016 - 05:16:34 EST


Hi,

We have a problem with software running on filesystem which uses FUSE.
We are able to repeat the behaviour when a particular piece of software
freezes after a few seconds.

We notice that this problem appears only on Linux kernels newer than
3.15.x and any FUSE filesystem. For example you may use SSHFS to repeat
this.

When we are running this software on standard hard disk, it works
correctly even on newest kernels.
I can add that we have tested kernels from 3.16.X to 4.2.X and in all
of them, the problem exists.

To repeat this error, please:
- download:
http://spades.bioinf.spbau.ru/release3.6.2/SPAdes-3.6.2-Linux.tar.gz

- extract tar file inside mounted SSHFS folder

- run program:
sshfs-mounted-folder/SPAdes-3.6.2-Linux/bin/spades.py --test

- after 2-3 seconds program will freeze. In our case it looks like this:

=== ITERATION 0 begins ===
0:00:00.005 4M / 4M INFO K-mer Index Building
(kmer_index.hpp : 464) Building kmer index
0:00:00.005 4M / 4M INFO K-mer Splitting
(kmer_data.cpp : 141) Splitting kmer instances into 16
buckets. This might take a while.
0:00:00.005 4M / 4M INFO General
(file_limit.hpp : 30) Open file limit set to 1024
0:00:00.005 4M / 4M INFO K-mer Splitting
(kmer_data.cpp : 159) Memory available for splitting
buffers: 83.332 Gb
0:00:00.005 4M / 4M INFO K-mer Splitting
(kmer_data.cpp : 167) Using cell size of 4194304
0:00:00.005 580M / 580M INFO K-mer Splitting
(kmer_data.cpp : 181) Processing
/mnt/mfs/SPAdes-3.6.2-Linux/share/spades/test_dataset/ecoli_1K_1.fq.gz
0:00:00.515 580M / 580M INFO K-mer Splitting
(kmer_data.cpp : 181) Processing
/mnt/mfs/SPAdes-3.6.2-Linux/share/spades/test_dataset/ecoli_1K_2.fq.gz
0:00:00.574 580M / 580M INFO K-mer Splitting
(kmer_data.cpp : 195) Processed 4108 reads
0:00:00.574 4M / 580M INFO General
(kmer_index.hpp : 342) Starting k-mer counting.
0:00:01.050 4M / 580M INFO General
(kmer_index.hpp : 348) K-mer counting done. There are 1974
kmers in total.
0:00:01.050 4M / 580M INFO General
(kmer_index.hpp : 350) Merging temporary buckets.
...
...
...

Our configuration:
- Ubuntu 14.04.1 x86_64
- kernel 3.19.0-43-generic
- libfuse 2.9.4

Kernel log:
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116703] INFO: task
python:6559 blocked for more than 120 seconds.
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116848] Not tainted
3.19.0-43-generic #49~14.04.1-Ubuntu
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116897] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116958] python D
ffff8801754bbd88 0 6559 3837 0x00000000
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116963] ffff8801754bbd88
ffff88007fab0000 0000000000013e80 ffff8801754bbfd8
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116965] 0000000000013e80
ffff880172b24e80 ffff88007fab0000 0000000000000000
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116967] ffff8800b0dd29a8
ffff8800b0dd29ac ffff88007fab0000 00000000ffffffff
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116969] Call Trace:
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116978]
[<ffffffff817b3879>] schedule_preempt_disabled+0x29/0x70
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116981]
[<ffffffff817b5565>] __mutex_lock_slowpath+0x95/0x100
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116983]
[<ffffffff817b55f3>] mutex_lock+0x23/0x37
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116987]
[<ffffffff811fa4ad>] vfs_unlink+0x4d/0x170
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116990]
[<ffffffff811fcd9b>] do_unlinkat+0x2bb/0x300
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116995]
[<ffffffff81015cd3>] ? do_device_not_available+0x33/0x70
Jan 12 11:38:47 blueocean252 kernel: [ 5885.116998]
[<ffffffff811fde46>] SyS_unlink+0x16/0x20
Jan 12 11:38:47 blueocean252 kernel: [ 5885.117000]
[<ffffffff817b770d>] system_call_fastpath+0x16/0x1b

Best regards
Aleksander Wieliczko
Technical Support Engineer
MooseFS.com