Re: BUG: unable to handle kernel NULL pointer dereference in rb_insert_color

From: Eric Biggers
Date: Tue Jan 30 2018 - 16:43:37 EST


On Wed, Dec 20, 2017 at 09:05:39AM +0100, Dmitry Vyukov wrote:
> On Wed, Dec 20, 2017 at 8:59 AM, Eric Biggers <ebiggers3@xxxxxxxxx> wrote:
> > On Wed, Dec 20, 2017 at 08:50:40AM +0100, Dmitry Vyukov wrote:
> >> >
> >> > The line number in lib/rbtree.c seems to be slightly off. Looking at the
> >> > disassembly:
> >> >
> >> > ffffffff825b5ea0 <rb_insert_color>:
> >> > ffffffff825b5ea0: 55 push %rbp
> >> > ffffffff825b5ea1: 48 8b 17 mov (%rdi),%rdx
> >> > ffffffff825b5ea4: 48 89 e5 mov %rsp,%rbp
> >> > ffffffff825b5ea7: 48 85 d2 test %rdx,%rdx
> >> > ffffffff825b5eaa: 0f 84 4c 01 00 00 je ffffffff825b5ffc <rb_insert_color+0x15c>
> >> > ffffffff825b5eb0: 48 8b 02 mov (%rdx),%rax
> >> > ffffffff825b5eb3: a8 01 test $0x1,%al
> >> > ffffffff825b5eb5: 75 5e jne ffffffff825b5f15 <rb_insert_color+0x75>
> >> > ffffffff825b5eb7: 48 8b 48 08 mov 0x8(%rax),%rcx
> >> >
> >> > It crashed on 'mov 0x8(%rax),%rcx' which corresponds to
> >> > 'tmp = gparent->rb_right;' at lib/rbtree.c:131. So 'parent' was the root node,
> >> > but its color was red, while it is supposed to be black.
> >> >
> >> > No idea how that happened, but it's almost certainly not an ext4 bug. In fact
> >> > there is another report of this same crash that has a different call trace:
> >> >
> >> > Call Trace:
> >> > key_alloc_serial security/keys/key.c:170 [inline]
> >> > key_alloc+0x54c/0x5b0 security/keys/key.c:319
> >> > keyring_alloc+0x4d/0xb0 security/keys/keyring.c:503
> >> > install_process_keyring_to_cred.part.3+0x38/0x80 security/keys/process_keys.c:192
> >> > install_process_keyring_to_cred security/keys/process_keys.c:634 [inline]
> >> > install_process_keyring security/keys/process_keys.c:217 [inline]
> >> > lookup_user_key+0x4ed/0x7c0 security/keys/process_keys.c:574
> >> > SYSC_add_key security/keys/keyctl.c:114 [inline]
> >> > SyS_add_key+0xec/0x260 security/keys/keyctl.c:62
> >> > entry_SYSCALL_64_fastpath+0x1f/0x96
> >>
> >>
> >> My first hypothesis for an unexplainable, non-reproducible
> >> corruption would be a data race. Is all the locking in place?
> >
> > It doesn't seem to be a locking problem. In the ext4 case the rbtree is
> > associated with a struct file's dir_private_info, which is protected by
> > ->f_pos_lock (taken early in sys_getdents()).
>
> But this won't prevent somebody else from messing with the struct
> without taking the lock.
>
> > And in the keyrings case, the
> > rbtree is protected by key_serial_lock.

Invalidating this bug since it hasn't been seen again, and it was reported while
KASAN was accidentally disabled in the syzbot config due to a change to the
kconfig menus in linux-next (so this crash was probably caused by slab
corruption elsewhere).
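
For anyone reading this later, here is a rough userspace sketch of the failure
mode described in the quoted analysis: a freshly linked (red) node whose parent
is the root, where the root's color bit has been corrupted to red. It is not
the kernel code. The struct copies the field layout of struct rb_node, the
check mirrors only the first few steps of the insert fix-up as I read them in
lib/rbtree.c, and the sketch_* names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace copy of the kernel's struct rb_node field layout (assumption:
 * same field order as include/linux/rbtree.h). */
struct rb_node {
	unsigned long __rb_parent_color;	/* parent pointer | color in bit 0 */
	struct rb_node *rb_right;
	struct rb_node *rb_left;
};

#define RB_RED   0UL
#define RB_BLACK 1UL

/* rb_right sits one word into the node, which is why the faulting
 * instruction was a load from 0x8(%rax) on x86_64. */
static_assert(offsetof(struct rb_node, rb_right) == sizeof(unsigned long),
	      "rb_right is expected to follow __rb_parent_color directly");

/* Hypothetical helper: initialize a node with the given parent and color. */
static void sketch_set(struct rb_node *n, struct rb_node *parent,
		       unsigned long color)
{
	n->__rb_parent_color = (unsigned long)parent | color;
	n->rb_left = NULL;
	n->rb_right = NULL;
}

/*
 * Hypothetical check: returns true if the insert fix-up, started on the red
 * node 'node', would go on to load gparent->rb_right with gparent == NULL.
 */
static bool sketch_insert_hits_null_gparent(const struct rb_node *node)
{
	/* 'node' is red, so its parent word holds the bare parent pointer. */
	const struct rb_node *parent =
		(const struct rb_node *)node->__rb_parent_color;

	if (!parent)
		return false;	/* node is the root: it is just colored black */

	if (parent->__rb_parent_color & RB_BLACK)
		return false;	/* black parent: nothing to fix up */

	/* Red parent: the fix-up assumes a grandparent exists, because a
	 * well-formed tree never has a red root. */
	const struct rb_node *gparent =
		(const struct rb_node *)parent->__rb_parent_color;

	return gparent == NULL;	/* the kernel would fault on gparent->rb_right */
}
```

With a root whose parent word is 0 (NULL parent, red) and a red child linked
under it, sketch_insert_hits_null_gparent() on the child returns true. With
the root correctly colored black it returns false. This only shows that the
crash is consistent with a corrupted root color; it says nothing about where
the corruption came from.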

#syz invalid