Re: [Pv-drivers] general protection fault in vmci_host_poll

From: Dae R. Jeong
Date: Sat Mar 25 2023 - 02:58:25 EST


On Wed, Aug 10, 2022 at 06:36:02PM +0000, Nadav Amit wrote:
> >> - Crash report:
> >> general protection fault, probably for non-canonical address 0xdffffc000000000b: 0000 [#1] PREEMPT SMP KASAN
> >> KASAN: null-ptr-deref in range [0x0000000000000058-0x000000000000005f]
> >> Call Trace:
> >> <TASK>
> >> lock_acquire+0x1a4/0x4a0 kernel/locking/lockdep.c:5672
> >> __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
> >> _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
> >> spin_lock include/linux/spinlock.h:349 [inline]
> >> vmci_host_poll+0x16b/0x2b0 drivers/misc/vmw_vmci/vmci_host.c:177
> >> vfs_poll include/linux/poll.h:88 [inline]
> >> do_pollfd fs/select.c:873 [inline]
> >> do_poll fs/select.c:921 [inline]
> >> do_sys_poll+0xc7c/0x1aa0 fs/select.c:1015
> >> __do_sys_ppoll fs/select.c:1121 [inline]
> >> __se_sys_ppoll+0x2cc/0x330 fs/select.c:1101
> >> do_syscall_x64 arch/x86/entry/common.c:51 [inline]
> >> do_syscall_64+0x4e/0xa0 arch/x86/entry/common.c:82
>
> Not my module, so just sharing my 2 cents:
>
> It seems that this is a bug that is related to interaction between different
> debugging features, and it might not be related to VMCI. IIUC, KASAN is
> yelling at lock-dependency checker.
>
> The code that the failure points to is the entry to the lock_release(),
> which raises the question whether additional debug features were enabled
> during the failure, specifically ftrace function tracer or kprobes.
>

Hello,

This crash keeps occurring in our fuzzing environment, so we looked
into it. It seems to be caused by a race condition, as follows:

CPU1                                    CPU2
vmci_host_poll                          vmci_host_do_init_context
-----                                   -----
// Read uninitialized context
context = vmci_host_dev->context;
                                        // Initialize context
                                        vmci_host_dev->context = vmci_ctx_create();
                                        vmci_host_dev->ct_type = VMCIOBJ_CONTEXT;

if (vmci_host_dev->ct_type == VMCIOBJ_CONTEXT) {
        // Dereferencing the wrong pointer
        poll_wait(..., &context->host_context);
}

I think reading `context` after checking `ct_type` in vmci_host_poll()
should be enough to prevent this. Could you check this?
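
To make the proposed ordering concrete, here is a minimal userspace
sketch of the idea. It is not the driver code: the types, the enum
values, and the function names (init_context, poll_fixed,
poll_racy_interleaved, self_check) are hypothetical stand-ins, and the
interleaving is replayed in a single thread by calling the init step
between the early read and the type check. The acquire/release ordering
on ct_type is an assumption of this sketch; whether the driver needs an
equivalent barrier is a separate question.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the kernel definitions. */
enum obj_type { OBJ_NOT_SET = 0, OBJ_CONTEXT = 1 };
enum poll_result { POLL_SKIPPED, POLL_OK, POLL_NULL_DEREF };

struct ctx { int host_context; };

struct host_dev {
    struct ctx *context;   /* NULL until init runs */
    _Atomic int ct_type;   /* written last by init */
};

/* Models the init path: set the pointer, then publish the type. */
static void init_context(struct host_dev *dev, struct ctx *c)
{
    dev->context = c;
    /* Assumption of this sketch: release store pairs with the acquire load. */
    atomic_store_explicit(&dev->ct_type, OBJ_CONTEXT, memory_order_release);
}

/* Reports what poll_wait() would be handed: a valid pointer or NULL. */
static enum poll_result classify(const struct ctx *context)
{
    return context ? POLL_OK : POLL_NULL_DEREF;
}

/* Reported order: read context first, check ct_type later.
 * The init step is injected in between to replay the race. */
static enum poll_result poll_racy_interleaved(struct host_dev *dev,
                                              struct ctx *c)
{
    struct ctx *context = dev->context;   /* early read, still NULL */

    init_context(dev, c);                 /* "CPU2" runs here */

    if (atomic_load_explicit(&dev->ct_type, memory_order_acquire)
            != OBJ_CONTEXT)
        return POLL_SKIPPED;
    return classify(context);             /* stale pointer is used */
}

/* Proposed order: check ct_type first, read context only afterwards. */
static enum poll_result poll_fixed(struct host_dev *dev)
{
    if (atomic_load_explicit(&dev->ct_type, memory_order_acquire)
            != OBJ_CONTEXT)
        return POLL_SKIPPED;              /* not initialized yet */
    return classify(dev->context);        /* read after the check */
}

/* Small usage example covering both orders. */
static void self_check(void)
{
    struct ctx c = { 0 };
    struct host_dev racy = { NULL, OBJ_NOT_SET };
    struct host_dev fixed = { NULL, OBJ_NOT_SET };

    assert(poll_racy_interleaved(&racy, &c) == POLL_NULL_DEREF);
    assert(poll_fixed(&fixed) == POLL_SKIPPED);
    init_context(&fixed, &c);
    assert(poll_fixed(&fixed) == POLL_OK);
}
```

In this model the early read returns NULL even though ct_type passes
the check, while the reordered version either skips the wait or sees
the pointer that init stored.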

Best regards,
Dae R. Jeong