Re: 2.6.32-rc6 BUG at mm/slab.c:2869!

From: Vegard Nossum
Date: Fri Aug 21 2009 - 04:28:46 EST


2009/8/21 Vegard Nossum <vegard.nossum@xxxxxxxxx>:
> 2009/8/21 Bob Copeland <me@xxxxxxxxxxxxxxx>:
>> On Thu, Aug 20, 2009 at 02:02:49PM +0200, Vegard Nossum wrote:
>>> > I'll try that and kmemcheck next.
>>>
>>> Hm, I'm afraid kmemcheck gives some known false positives related to
>>> bitfields in ext4 code, so in the case that something turned up, it
>>> might be hard to distinguish it from those false positives.
>>
>> Well I didn't get anything from ext4 so far. ÂI did hit one with
>> fsnotify:
>>
>> WARNING: kmemcheck: Caught 32-bit read from freed memory (f34a443c)
>> eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee008a06f700011000
>> Âa a a a a a a a a a a a a a a a a a a a a a a a f f f f f f f f
>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â ^
>>
>> Pid: 2745, comm: fsck.ext4 Not tainted (2.6.31-rc6 #2) MacBook1,1
>> EIP: 0060:[<c10f3656>] EFLAGS: 00010217 CPU: 0
>> EIP is at inotify_handle_event+0x76/0xc0
>> EAX: f34a443c EBX: f34a4438 ECX: 00000000 EDX: f6732000
>> ESI: f6559764 EDI: 00000000 EBP: f6733f0c ESP: c1527450
>> ÂDS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
>> CR0: 8005003b CR2: f6c046d4 CR3: 367fb000 CR4: 000026d0
>> DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
>> DR6: ffff4ff0 DR7: 00000400
>> Â[<c10f0d78>] fsnotify+0xa8/0x130
>> Â[<c10c5e11>] __fput+0xb1/0x1e0
>> Â[<c10c5f55>] fput+0x15/0x20
>> Â[<c10c2ca7>] filp_close+0x47/0x80
>> Â[<c10c2d54>] sys_close+0x74/0xc0
>> Â[<c1002ec8>] sysenter_do_call+0x12/0x36
>> Â[<ffffffff>] 0xffffffff
>>
>> I think that is list_empty() here where %eax is list_head
>> and event_list->next is the read location... which definitely
>> doesn't look like a pointer, if I'm reading it correctly.
>
> I think f34a443c is a valid pointer. On my machine, at least:
>
> [ Â Â0.004000] Â Â lowmem Â: 0xc0000000 - 0xf73fe000 Â ( 883 MB)
>
>>
>> inotify_fsnotify.o:
>>
>> Â Â Â Â/* did event_priv get attached? */
>> Â Â Â Âif (list_empty(&fsn_event_priv->event_list))
>> Â143:  8d 43 04        Âlea  Â0x4(%ebx),%eax
>>
>> Â Â Â Âevent_priv = kmem_cache_alloc(event_priv_cachep, GFP_KERNEL);
>> Â Â Â Âif (unlikely(!event_priv))
>> Â Â Â Â Â Â Â Âreturn -ENOMEM;
>>
>> Â Â Â Âfsn_event_priv = &event_priv->fsnotify_event_priv_data;
>> Â146:  39 43 04        Âcmp  Â%eax,0x4(%ebx)   <=== read here
>> Â149:  74 1d          je   168 <inotify_handle_event+0x98>
>
> I can see somewhat of a race, I think:
>
> 1. userspace calls inotify_read(), where we wait for something to happen:
>
> 249 Â Â Â Â while (1) {
> 250 Â Â Â Â Â Â Â Â prepare_to_wait(&group->notification_waitq, &wait,
> TASK_INTERRUPTIBLE);
> 251
> 252 Â Â Â Â Â Â Â Â mutex_lock(&group->notification_mutex);
> 253 Â Â Â Â Â Â Â Â kevent = get_one_event(group, count);
> 254 Â Â Â Â Â Â Â Â mutex_unlock(&group->notification_mutex);
>
> 2. an event occurs, and inotify_handle_event() calls
> fsnotify_add_notify_event():
>
> Â64 Â Â Â Â ret = fsnotify_add_notify_event(group, event, fsn_event_priv);
> Â65 Â Â Â Â /* EEXIST is not an error */
> Â66 Â Â Â Â if (ret == -EEXIST)
> Â67 Â Â Â Â Â Â Â Â ret = 0;
>
> 3. fsnotify_add_notify_event() adds the fsn_event_priv to the event,
> and adds the event to the group, and finally wakes up anybody who is
> waiting on &group->notification_waitq:
>
> 230 Â Â Â Â fsnotify_get_event(event);
> 231 Â Â Â Â list_add_tail(&holder->event_list, list);
> 232 Â Â Â Â if (priv)
> 233 Â Â Â Â Â Â Â Â list_add_tail(&priv->event_list, &event->private_data_list);
> 234 Â Â Â Â spin_unlock(&event->lock);
> 235 Â Â Â Â mutex_unlock(&group->notification_mutex);
> 236
> 237 Â Â Â Â wake_up(&group->notification_waitq);
>
> 4. inotify_read() wakes up and frees the event:
>
> 253 Â Â Â Â Â Â Â Â kevent = get_one_event(group, count);
>
> 5. inotify_handle_event() now dereferences the event_priv pointer,
> which was already freed:
>
> Â69 Â Â Â Â /* did event_priv get attached? */
> Â70 Â Â Â Â if (list_empty(&fsn_event_priv->event_list))
>
>
> I think that's it. Any thoughts? I put Eric Paris on Cc.

I guess it was fixed by this recently posted patch:

http://osdir.com/ml/linux-kernel/2009-08/msg05185.html

Was kmemcheck by any chance used to discover this race in the first place? ;-)


Vegard
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/