Re: [PATCH] fix a race between /proc/lock_stat and module unloading
From: Jerome Marchand
Date: Tue Jun 02 2015 - 05:54:34 EST
On 06/02/2015 11:30 AM, Peter Zijlstra wrote:
> On Fri, May 29, 2015 at 02:47:15PM +0200, Jerome Marchand wrote:
>> When opening /proc/lock_stat, lock_stat_open() makes a copy of
>> all_lock_classes list in the form of an array of ad hoc structures
>> lock_stat_data that reference lock_class, so it can be sorted and
>> passed to seq_read(). However, nothing prevents module unloading code
>> to free some of these lock_class structures before seq_read() tries to
>> access them.
>
> Well, how about lock_class being from a static array in lockdep.c:138 ?
>
>
I guess I jumped to a conclusion here and my explanation is wrong. However,
there is still a bug, which occurs when the kernel tries to access
class->name in seq_stats():
[ 43.533732] BUG: unable to handle kernel paging request at
ffffffffa03181ce
[ 43.534006] IP: [<ffffffff8142b489>] strnlen+0x9/0x50
[ 43.534006] PGD 1e14067 PUD 1e15063 PMD 79153067 PTE 0
[ 43.534006] Oops: 0000 [#1] SMP
[ 43.534006] Modules linked in: ip6t_rpfilter ip6t_REJECT
nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc
ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6
nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw
ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security
iptable_raw ppdev iosf_mbi crct10dif_pclmul crc32_pclmul crc32c_intel
ghash_clmulni_intel serio_raw virtio_balloon virtio_console parport_pc
parport pvpanic i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc
virtio_blk virtio_net qxl drm_kms_helper ttm drm virtio_pci virtio_ring
virtio ata_generic pata_acpi [last unloaded: zram]
[ 43.534006] CPU: 0 PID: 2125 Comm: cat Not tainted
3.19.4-200.fc21.x86_64+debug #1
[ 43.534006] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 43.534006] task: ffff88007a468000 ti: ffff88007a374000 task.ti:
ffff88007a374000
[ 43.534006] RIP: 0010:[<ffffffff8142b489>] [<ffffffff8142b489>]
strnlen+0x9/0x50
[ 43.534006] RSP: 0018:ffff88007a377ba8 EFLAGS: 00010286
[ 43.534006] RAX: ffffffff81c7ac25 RBX: ffff88007a377ce9 RCX:
0000000000000000
[ 43.534006] RDX: ffffffffa03181ce RSI: ffffffffffffffff RDI:
ffffffffa03181ce
[ 43.534006] RBP: ffff88007a377ba8 R08: 000000000000ffff R09:
000000000000ffff
[ 43.534006] R10: 0000000000000000 R11: 0000000000000000 R12:
ffffffffa03181ce
[ 43.534006] R13: ffff88007a377d0f R14: 00000000ffffffff R15:
0000000000000000
[ 43.534006] FS: 00007f6e132f3700(0000) GS:ffff88007d200000(0000)
knlGS:0000000000000000
[ 43.534006] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 43.534006] CR2: ffffffffa03181ce CR3: 000000007b777000 CR4:
00000000001406f0
[ 43.534006] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 43.534006] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[ 43.534006] Stack:
[ 43.534006] ffff88007a377be8 ffffffff8142d92f ffff88007a377c81
ffff88007a377ce9
[ 43.534006] ffff88007a377d0f ffff88007a377c78 ffffffff81cd0158
ffffffff81cd0158
[ 43.534006] ffff88007a377c68 ffffffff8142f0d9 ffff88007a377ce9
ffff88007b128800
[ 43.534006] Call Trace:
[ 43.534006] [<ffffffff8142d92f>] string.isra.7+0x3f/0xf0
[ 43.534006] [<ffffffff8142f0d9>] vsnprintf+0x199/0x5b0
[ 43.534006] [<ffffffff812a329c>] ? seq_printf+0x4c/0x70
[ 43.534006] [<ffffffff8142f593>] snprintf+0x43/0x60
[ 43.534006] [<ffffffff812a33c8>] ? seq_puts+0x48/0x70
[ 43.534006] [<ffffffff8111278c>] seq_stats+0x7c/0x520
[ 43.534006] [<ffffffff8110db4c>] ? mark_held_locks+0x7c/0xb0
[ 43.534006] [<ffffffff8187486c>] ? mutex_lock_nested+0x28c/0x440
[ 43.534006] [<ffffffff8110dcbd>] ? trace_hardirqs_on_caller+0x13d/0x1e0
[ 43.534006] [<ffffffff81112c47>] ls_show+0x17/0x120
[ 43.534006] [<ffffffff812c1822>] ? fsnotify+0x462/0x820
[ 43.534006] [<ffffffff812c1458>] ? fsnotify+0x98/0x820
[ 43.534006] [<ffffffff812a2cf6>] seq_read+0x316/0x400
[ 43.534006] [<ffffffff812f06a8>] proc_reg_read+0x48/0x70
[ 43.534006] [<ffffffff81277378>] __vfs_read+0x18/0x50
[ 43.534006] [<ffffffff8127743d>] vfs_read+0x8d/0x150
[ 43.534006] [<ffffffff8127755c>] SyS_read+0x5c/0xd0
[ 43.534006] [<ffffffff81879389>] system_call_fastpath+0x12/0x17
[ 43.534006] Code: 40 00 48 83 c0 01 80 38 00 75 f7 48 29 f8 5d c3 31
c0 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 85 f6 48 89 e5
74 3f <80> 3f 00 74 3a 48 8d 47 01 48 01 fe eb 13 66 0f 1f 84 00 00 00
[ 43.534006] RIP [<ffffffff8142b489>] strnlen+0x9/0x50
[ 43.534006] RSP <ffff88007a377ba8>
[ 43.534006] CR2: ffffffffa03181ce
[ 43.534006] ---[ end trace 609a4a4bd210562d ]---
So I guess it's actually just class->name that gets freed underneath us.
The following script easily triggers the bug unless my patch is applied:
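#!/bin/sh
# Load and unload zram in a background loop ...
while true; do
    modprobe zram
    modprobe -r zram
done &
# ... while repeatedly reading /proc/lock_stat in the foreground.
while true; do
    cat /proc/lock_stat > /dev/null
done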
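
As a rough userspace illustration (not kernel code) of the situation suspected above: a record that lives in a static array stays valid forever, but a name pointer it borrowed from a module does not. The sketch below assumes that class->name pointed into memory that went away with the module. All names in it (demo_class, demo_module, demo_module_load and so on) are hypothetical and exist only for this example. It is single-threaded and uses no locking or RCU, so it only shows the pointer lifetime problem and one possible way for a reader to detect a cleared name, not how a real fix would have to deal with the concurrent reader.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a lock class: the record itself lives in a
 * static array and is never freed, but 'name' is borrowed from its owner. */
struct demo_class {
    const char *name;      /* points into memory owned by a "module" */
    unsigned long hits;
};

static struct demo_class demo_classes[4];

/* Hypothetical "module": owns the heap block that holds the name string. */
struct demo_module {
    char *strings;
    size_t size;
};

/* True if addr lies inside [start, start + size). */
static int within(const void *addr, const void *start, size_t size)
{
    return (uintptr_t)addr - (uintptr_t)start < size;
}

/* "Load": copy the name into module-owned memory and register a class
 * in the given slot that points at that copy. */
int demo_module_load(struct demo_module *mod, const char *lock_name, int slot)
{
    mod->size = strlen(lock_name) + 1;
    mod->strings = malloc(mod->size);
    if (!mod->strings)
        return -1;
    memcpy(mod->strings, lock_name, mod->size);
    demo_classes[slot].name = mod->strings;
    demo_classes[slot].hits = 0;
    return 0;
}

/* "Unload": clear every name that points into the module's memory
 * before freeing it, so that no record keeps a dangling pointer. */
void demo_module_unload(struct demo_module *mod)
{
    size_t i;

    for (i = 0; i < sizeof(demo_classes) / sizeof(demo_classes[0]); i++) {
        const char *name = demo_classes[i].name;

        if (name && within(name, mod->strings, mod->size))
            demo_classes[i].name = NULL;
    }
    free(mod->strings);
    mod->strings = NULL;
    mod->size = 0;
}

/* Reader side: format one stats line, tolerating a cleared name. */
int demo_format_class(const struct demo_class *c, char *buf, size_t len)
{
    const char *name = c->name;

    if (!name)
        return snprintf(buf, len, "<unloaded>: %lu", c->hits);
    return snprintf(buf, len, "%s: %lu", name, c->hits);
}
```

Without the loop in demo_module_unload(), a reader that kept a pointer to the record across the unload would hand a freed string to snprintf(), which is the userspace analogue of the strnlen() fault in the trace.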