Re: Soft lockups during reading /proc/PID/smaps

From: Aleksei Besogonov
Date: Thu Jul 31 2014 - 07:13:50 EST


On 31 Jul 2014, at 00:43, David Rientjes <rientjes@xxxxxxxxxx> wrote:
> On Thu, 31 Jul 2014, Aleksei Besogonov wrote:
>> I'm getting weird soft lockups while reading smaps on loaded systems with
>> some background cgroups usage. This issue can be reproduced with the most
>> recent kernel.
>>
>> Here's the stack trace:
>> [ 1748.312052] BUG: soft lockup - CPU#6 stuck for 23s! [python2.7:1857]
>> [ 1748.312052] Modules linked in: xfs xt_addrtype xt_conntrack
>> iptable_filter ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
>> nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables bridge stp llc
>> dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c nfsd
>> auth_rpcgss nfs_acl nfs lockd sunrpc fscache dm_crypt psmouse serio_raw
>> ppdev parport_pc i2c_piix4 parport xen_fbfront fb_sys_fops syscopyarea
>> sysfillrect sysimgblt mac_hid isofs raid10 raid456 async_memcpy
>> async_raid6_recov async_pq async_xor async_tx xor raid6_pq raid1 raid0
>> multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
>> aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd floppy
>> [ 1748.312052] CPU: 6 PID: 1857 Comm: python2.7 Not tainted
>> 3.15.5-031505-generic #201407091543
> This isn't the most recent kernel, we're at 3.16-rc7 now, but I don't
> think there are any changes that would prevent this.
Yes, I also tested it with 3.16-rc7; the error report above is from a previous run with an Ubuntu 14.04 kernel.

> The while_each_thread() in vm_is_stack() looks suspicious since the task
> isn't current and rcu won't protect the iteration, and we also don't hold
> sighand lock or a readlock on tasklist_lock.
> I think Oleg will know how to proceed, cc'd.
Below is a minimal test case that reproduces the issue. It triggers the lockup in 100% of cases on every system I’ve tried.

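#!/usr/bin/env python2.7
from os import mkdir
from threading import Thread
from time import sleep
import os

__author__ = 'cyberax'

# Number of worker threads currently alive (updated without locking).
count = 0


def threadproc():
    # Short-lived thread: exists for about 10 ms, then exits.
    global count
    count += 1
    sleep(0.01)
    count -= 1


def do_threads():
    # Initial delay; the parent uses it to move this process into the cgroup.
    sleep(2)
    while True:
        # Keep at most about 200 threads alive at any time.
        while count > 200:
            sleep(0.01)

        th = Thread(target=threadproc)
        th.start()


def do_reader(pid):
    while True:
        with open("/sys/fs/cgroup/memory/ck/1001/tasks", "r") as fl:
            fl.readlines()
        with open("/sys/fs/cgroup/memory/ck/1001/delegate/tasks", "r") as fl:
            lines = fl.readlines()
        # Read smaps for every task in the cgroup while its threads come and go.
        for l in lines:
            try:
                with open("/proc/%s/smaps" % l.strip(), "r") as fl:
                    fl.readlines()
            except:
                # The task may have exited already; ignore and move on.
                pass

# Child: churn threads forever. Parent: set up the cgroup and read smaps.
pid = os.fork()
if pid == 0:
    do_threads()
    exit(0)

try:
    mkdir('/sys/fs/cgroup/memory/ck')
    mkdir('/sys/fs/cgroup/memory/ck/1001')
    mkdir('/sys/fs/cgroup/memory/ck/1001/delegate')
except:
    # The directories may be left over from a previous run.
    pass

with open('/sys/fs/cgroup/memory/ck/1001/delegate/tasks', 'w') as fl:
    fl.write('%d\n' % pid)

do_reader(pid)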
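
For anyone adapting the test case to a different cgroup mount point, the reader step can be factored into two small helpers. This is only a sketch and not part of the original test case: the names read_task_ids and read_smaps, the proc_root parameter, and the choice of ENOENT/ESRCH as the "task already exited" errors are all assumptions made for illustration.

```python
import errno
import os


def read_task_ids(tasks_path):
    """Return the IDs listed in a cgroup 'tasks' file, one per line."""
    with open(tasks_path, "r") as fl:
        return [line.strip() for line in fl if line.strip()]


def read_smaps(task_id, proc_root="/proc"):
    """Read smaps for one task; return None if the task is already gone.

    proc_root is a hypothetical parameter so the helper can be exercised
    against a fake directory tree instead of the real /proc.
    """
    path = os.path.join(proc_root, task_id, "smaps")
    try:
        with open(path, "r") as fl:
            return fl.readlines()
    except (IOError, OSError) as e:
        # Assumption: these two errno values mean the task exited between
        # listing the cgroup and reading its smaps file.
        if e.errno in (errno.ENOENT, errno.ESRCH):
            return None
        raise
```

With these helpers, the body of the reader loop reduces to calling read_smaps(tid) for each tid in read_task_ids("/sys/fs/cgroup/memory/ck/1001/delegate/tasks"). Unlike the bare except in the test case, unexpected errors are re-raised rather than silently ignored.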