Re: Any known soft lockup issue with vfs_write()->fsnotify()?
From: Jan Kara
Date: Mon Mar 05 2018 - 15:48:45 EST
Hi!
On Fri 02-03-18 22:28:50, Dexuan Cui wrote:
> Recently people are getting a soft lockup issue with vfs_write()->fsnotify().
> The detailed calltrace is available at:
> https://github.com/coreos/bugs/issues/2356
> https://github.com/coreos/bugs/issues/2364
I haven't seen them yet.
> The kernel versions showing the issue are:
> 4.14.11-coreos
> 4.14.19-coreos
> 4.13.0-1009 -- this is the kernel with which I'm personally seeing the lockup.
>
> I have not got a chance to try the latest mainline kernel yet.
It would be good to try a 4.15 kernel to see whether the recent fixes from
Miklos fix your problem. They should be present in the 4.14.11/19 kernels
as well, but one never knows...
> Before the lockup error message suddenly appears, Linux has been running
> fine for many hours. I have NOT found a consistent way to reproduce the
> lockup yet.
>
> It looks like the kernel is stuck in fsnotify() when it tries to get the
> fsnotify_mark_srcu lock.
It is not possible for us to 'hang' in srcu_read_lock(): that is just a
read of one variable and an increment of another. We'd have to be looping
somewhere, and the watchdog would have to happen to hit us at that place
every time. Weird. Are you sure RIP points to srcu_read_lock?
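To illustrate why the read side has nothing to spin on, here is a minimal
userspace sketch of the "read one variable, increment another" pattern. It
is a simplified model under stated assumptions, not the kernel code: the
model_srcu names are hypothetical, and one pair of global atomic counters
stands in for the per-CPU counters the kernel uses.

```c
#include <stdatomic.h>

/*
 * Simplified userspace model of an SRCU read-side lock.
 * Assumption: all names here are hypothetical, and global atomics
 * replace the kernel's per-CPU counters.
 */
struct model_srcu {
	atomic_int idx;               /* which counter pair is current */
	atomic_ulong lock_count[2];   /* readers that entered, per index */
	atomic_ulong unlock_count[2]; /* readers that left, per index */
};

/* Enter a read-side section: one load, one increment, no loop. */
static int model_srcu_read_lock(struct model_srcu *sp)
{
	int idx = atomic_load(&sp->idx) & 0x1;

	atomic_fetch_add(&sp->lock_count[idx], 1);
	return idx;
}

/* Leave a read-side section: one increment on the same index. */
static void model_srcu_read_unlock(struct model_srcu *sp, int idx)
{
	atomic_fetch_add(&sp->unlock_count[idx], 1);
}

/* Number of readers currently inside a section for the given index. */
static unsigned long model_srcu_readers(struct model_srcu *sp, int idx)
{
	return atomic_load(&sp->lock_count[idx]) -
	       atomic_load(&sp->unlock_count[idx]);
}
```

Neither function loops or waits on anything, so in this model a reader
cannot get stuck in the lock call itself. Any waiting would be on the
update side, which has to wait until the readers for an index have left.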
> "git log fs/notify/fsnotify.c" on the latest mainline shows that some
> recent patches might help.
>
> I'd like to check if this is a known issue.
As I've mentioned above, so far I haven't seen reports like this...
Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR