Re: frequent lockups in 3.18rc4

From: Chris Mason
Date: Mon Dec 01 2014 - 18:44:55 EST




On Mon, Dec 1, 2014 at 6:25 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Dec 1, 2014 at 3:08 PM, Chris Mason <clm@xxxxxx> wrote:
>> I'm not sure if this is related, but running trinity here, I noticed it
>> was stuck at 100% system time on every CPU. perf report tells me we are
>> spending all of our time in spin_lock under the sync system call.
>>
>> I think it's coming from contention in the bdi_queue_work() call from
>> inside sync_inodes_sb, which is spin_lock_bh().
>
> Please do a perf run with -g to get the call chain to make sure..

The call chain goes something like this:

--- _raw_spin_lock
   |
   |--99.72%-- sync_inodes_sb
   |          sync_inodes_one_sb
   |          iterate_supers
   |          sys_sync
   |          |
   |          |--79.66%-- system_call_fastpath
   |          |          syscall
   |          |
   |           --20.34%-- ia32_sysret
   |                     __do_syscall
    --0.28%-- [...]

(The 64-bit call variation is similar.) Adding -v doesn't really help, because it isn't giving me the address inside sync_inodes_sb().

I first read this and guessed it must be leaving out the call to bdi_queue_work, hoping the spin_lock_bh and lock debugging were teaming up to stall the box.

But looking harder it's probably inside wait_sb_inodes:

spin_lock(&inode_sb_list_lock);

Which is a little harder to blame. Maaaaaybe with lock debugging, but it's enough of a stretch that I wouldn't have emailed at all if I hadn't fixated on the bdi code.
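
For reference, the pattern in question (every concurrent sync() caller funnelling through one global spinlock and walking a list while holding it) can be sketched in userspace. This is only an illustration under stated assumptions, not the kernel's wait_sb_inodes(): fake_sb_list_lock, FAKE_INODES, fake_wait_sb_inodes() and run_fake_sync_storm() are all hypothetical names, the spinlock is a plain C11 atomic_flag, and the "inode list" is just a counter loop.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the global inode_sb_list_lock. */
static atomic_flag fake_sb_list_lock = ATOMIC_FLAG_INIT;

/* Hypothetical length of the per-superblock inode list. */
#define FAKE_INODES 64

struct worker_args {
    int iterations;      /* number of fake sync() calls per thread */
    long *shared_visits; /* counter protected by fake_sb_list_lock */
};

/* Busy-wait until the flag is ours, like a (much simplified) spin_lock(). */
static void fake_spin_lock(atomic_flag *lock)
{
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ; /* spin: this is where contended callers burn CPU time */
}

static void fake_spin_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* One fake sync(): take the global lock, walk the whole list, drop it. */
static void fake_wait_sb_inodes(long *shared_visits)
{
    assert(shared_visits != NULL);
    fake_spin_lock(&fake_sb_list_lock);
    for (int i = 0; i < FAKE_INODES; i++)
        (*shared_visits)++;
    fake_spin_unlock(&fake_sb_list_lock);
}

static void *worker(void *p)
{
    struct worker_args *args = p;

    for (int i = 0; i < args->iterations; i++)
        fake_wait_sb_inodes(args->shared_visits);
    return NULL;
}

/*
 * Run nthreads concurrent callers, each doing `iterations` fake syncs.
 * Returns the total number of list entries visited, or -1 on error.
 */
long run_fake_sync_storm(int nthreads, int iterations)
{
    enum { MAX_THREADS = 64 };
    pthread_t tids[MAX_THREADS];
    struct worker_args args;
    long visits = 0;
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS || iterations < 0)
        return -1;

    args.iterations = iterations;
    args.shared_visits = &visits;

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, worker, &args) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return started == nthreads ? visits : -1;
}
```

Every caller has to wait its turn for the whole list walk, so with one thread per CPU all but one of them are spinning at any moment. Profiling a loop around run_fake_sync_storm() with perf and -g should put most samples in the lock acquisition, which is the same general shape as the call chain above, though it says nothing about whether lock debugging is what makes the real case this bad.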

-chris


