Re: [Bisected] Regression: cpu stuck in gvfsd-fuse, can't shutdown
From: Thierry Reding
Date: Wed Nov 12 2014 - 06:27:00 EST
On Tue, Nov 11, 2014 at 11:44:26PM +0200, Giedrius Statkevicius wrote:
> On 2014.11.11 23:05, Greg KH wrote:
> >
> > If you revert this patch, does things go back to "normal" for you?
>
> Originally I've only tested where the HEAD was
> 32eca22180804f71b06b63fd29b72f58be8b3c47 versus
> 32eca22180804f71b06b63fd29b72f58be8b3c47~1 but now I recompiled and
> tested a vanilla 3.18.0-rc4-next-20141111 on which this issue occurs and
> then tried a version with that particular patch reverted and then no
> lockups happen.
I've run into this same issue with sshfs:
[ 49.231095] BUG: spinlock bad magic on CPU#1, sshfs/180
[ 49.239078] lock: fuse_miscdevice+0x0/0x24, .magic: c09ce64c, .owner: /0, .owner_cpu: -1065526976
[ 49.248551] CPU: 1 PID: 180 Comm: sshfs Not tainted 3.18.0-rc4-next-20141111-00275-g3eeaa958e58c-dirty #2654
[ 49.258443] [<c00161f8>] (unwind_backtrace) from [<c0011a88>] (show_stack+0x10/0x14)
[ 49.266269] [<c0011a88>] (show_stack) from [<c07b50b4>] (dump_stack+0x98/0xd8)
[ 49.273618] [<c07b50b4>] (dump_stack) from [<c0068670>] (do_raw_spin_lock+0x1a4/0x1a8)
[ 49.281621] [<c0068670>] (do_raw_spin_lock) from [<c022a0f0>] (fuse_dev_release+0x1c/0x68)
[ 49.289900] [<c022a0f0>] (fuse_dev_release) from [<c00f5078>] (__fput+0x80/0x1c8)
[ 49.297470] [<c00f5078>] (__fput) from [<c003fc38>] (task_work_run+0xb4/0xec)
[ 49.304700] [<c003fc38>] (task_work_run) from [<c001140c>] (do_work_pending+0xa0/0xc0)
[ 49.312712] [<c001140c>] (do_work_pending) from [<c000e5e0>] (work_pending+0xc/0x20)
[ 49.701449] BUG: spinlock lockup suspected on CPU#1, sshfs/180
[ 49.707327] lock: fuse_miscdevice+0x0/0x24, .magic: c09ce64c, .owner: /0, .owner_cpu: -1065526976
[ 49.716341] CPU: 1 PID: 180 Comm: sshfs Not tainted 3.18.0-rc4-next-20141111-00275-g3eeaa958e58c-dirty #2654
[ 49.726238] [<c00161f8>] (unwind_backtrace) from [<c0011a88>] (show_stack+0x10/0x14)
[ 49.734051] [<c0011a88>] (show_stack) from [<c07b50b4>] (dump_stack+0x98/0xd8)
[ 49.741293] [<c07b50b4>] (dump_stack) from [<c00685c8>] (do_raw_spin_lock+0xfc/0x1a8)
[ 49.749178] [<c00685c8>] (do_raw_spin_lock) from [<c022a0f0>] (fuse_dev_release+0x1c/0x68)
[ 49.757508] [<c022a0f0>] (fuse_dev_release) from [<c00f5078>] (__fput+0x80/0x1c8)
[ 49.765058] [<c00f5078>] (__fput) from [<c003fc38>] (task_work_run+0xb4/0xec)
[ 49.772264] [<c003fc38>] (task_work_run) from [<c001140c>] (do_work_pending+0xa0/0xc0)
[ 49.780197] [<c001140c>] (do_work_pending) from [<c000e5e0>] (work_pending+0xc/0x20)
Reverting 32eca2218080 ("misc: always assign miscdevice to
file->private_data in open()") fixes the issue for me.

Looking at the stacktrace and correlating it with the code, what happens
is that fuse_fill_super() checks that file->private_data hasn't been set
yet and errors out otherwise. With the misc_open() change in the above
commit, file->private_data is already set at that point, so the check
fails.

The ensuing BUG comes from the fact that the error cleanup path assumes
that if file->private_data is set, it is a struct fuse_conn *. It now
points to the miscdevice instead, which matches the lock address
fuse_miscdevice+0x0/0x24 in the log, so it's no surprise that
fuse_dev_release() fails as above.

The root of the issue is that the assumption in the above commit, that
drivers will always overwrite ->private_data, doesn't hold, at least in
the case of FUSE.
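
To make the sequence above concrete, here is a rough userspace model in C.
It is only a sketch under stated assumptions, not the kernel code: every
model_* name, the struct layouts and the result enum are made up for
illustration, and the real fuse_dev_release() takes a spinlock inside the
object rather than comparing pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct model_miscdevice { const char *name; };
struct model_fuse_conn  { int unused; };
struct model_file       { void *private_data; };

/* What the modelled release path ends up operating on. */
enum model_release_result {
	MODEL_RELEASE_NOTHING,	/* private_data was NULL, nothing to clean up */
	MODEL_RELEASE_CONN,	/* private_data really was the connection */
	MODEL_RELEASE_BOGUS	/* private_data pointed at something else */
};

static struct model_miscdevice model_fuse_misc = { "fuse" };

/*
 * Models misc_open(). With assign_miscdevice != 0 it behaves like the
 * commit under discussion and pre-sets private_data to the miscdevice.
 */
static void model_misc_open(struct model_file *file, int assign_miscdevice)
{
	file->private_data = assign_miscdevice ? (void *)&model_fuse_misc : NULL;
}

/*
 * Models the check described for fuse_fill_super(): mounting only
 * succeeds if private_data has not been set yet.
 */
static int model_fill_super(struct model_file *file, struct model_fuse_conn *fc)
{
	if (file->private_data != NULL)
		return -1;	/* error out, private_data is left untouched */

	file->private_data = fc;
	return 0;
}

/*
 * Models the assumption in the release path: any non-NULL private_data
 * is treated as a connection. The model only compares pointers instead
 * of dereferencing the wrong type.
 */
static enum model_release_result
model_dev_release(const struct model_file *file,
		  const struct model_fuse_conn *real_conn)
{
	assert(file != NULL);

	if (file->private_data == NULL)
		return MODEL_RELEASE_NOTHING;
	if (file->private_data == (const void *)real_conn)
		return MODEL_RELEASE_CONN;
	return MODEL_RELEASE_BOGUS;
}

/* Open the device, attempt the mount, then close the file again. */
enum model_release_result model_mount_then_close(int assign_miscdevice)
{
	struct model_fuse_conn conn = { 0 };
	struct model_file file = { NULL };

	model_misc_open(&file, assign_miscdevice);
	(void)model_fill_super(&file, &conn);	/* may fail, as in the report */
	return model_dev_release(&file, &conn);
}
```

In this model, model_mount_then_close(1) ends in MODEL_RELEASE_BOGUS, which
corresponds to the spinlock being taken on the miscdevice, while
model_mount_then_close(0) ends in MODEL_RELEASE_CONN, corresponding to the
behaviour with the commit reverted.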
Thierry