Re: [syzbot] [kvm?] [net?] [virt?] general protection fault in vhost_work_queue

From: Stefano Garzarella
Date: Tue May 30 2023 - 12:18:10 EST


On Tue, May 30, 2023 at 11:09:09AM -0500, Mike Christie wrote:
> On 5/30/23 11:00 AM, Stefano Garzarella wrote:
>> I think it is partially related to commit 6e890c5d5021 ("vhost: use
>> vhost_tasks for worker threads") and commit 1a5f8090c6de ("vhost: move
>> worker thread fields to new struct"). Maybe those commits just
>> highlighted the issue and it already existed.

> See my mail about the crash. Agree with your analysis about worker->vtsk
> not being set yet. It's a bug from my commit where I should not have set
> it so early or I should be checking for
>
> if (dev->worker && worker->vtsk)
>
> instead of
>
> if (dev->worker)

Yes, though, in my opinion the problem may persist depending on how the
instructions are reordered.

Should we protect dev->worker with RCU to be safe?
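
To illustrate the ordering concern, here is a minimal userspace sketch that
assumes C11 atomics in place of the kernel's RCU primitives. All names
(sketch_dev, sketch_worker, sketch_set_owner, sketch_work_queue) are
hypothetical and are not the vhost code. The worker is fully initialised
before the pointer is published with release semantics, and the reader
loads it with acquire semantics, so a non-NULL pointer implies the fields
are visible.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's vhost structures. */
struct sketch_worker {
    int vtsk;           /* stands in for worker->vtsk; nonzero once set */
    atomic_int queued;  /* number of work items accepted */
};

struct sketch_dev {
    /* NULL until the owner has been set. */
    _Atomic(struct sketch_worker *) worker;
};

/* Writer side: initialise every field first, publish the pointer last. */
static void sketch_set_owner(struct sketch_dev *dev, struct sketch_worker *w)
{
    w->vtsk = 1;
    atomic_init(&w->queued, 0);
    /* Release store: the writes above cannot be reordered after this. */
    atomic_store_explicit(&dev->worker, w, memory_order_release);
}

/* Reader side: returns false when the work is dropped (no worker yet). */
static bool sketch_work_queue(struct sketch_dev *dev)
{
    /* Acquire load pairs with the release store in sketch_set_owner(). */
    struct sketch_worker *w =
        atomic_load_explicit(&dev->worker, memory_order_acquire);

    if (!w)
        return false;

    atomic_fetch_add(&w->queued, 1);
    return true;
}
```

This only covers the publication ordering. It does not address freeing the
worker while a reader still holds the pointer, which is the part a grace
period would additionally have to handle.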


> One question about the behavior before my commit though and what we want in
> the end going forward. Before that patch we would just drop work if
> vhost_work_queue was called before VHOST_SET_OWNER. Was that correct/expected?

I think so, since we ask the guest to call VHOST_SET_OWNER, before any
other command.


> The call to vhost_work_queue in vhost_vsock_start was only seeing the
> works queued after VHOST_SET_OWNER. Did you want works queued before that?


Yes, for example if an application in the host has tried to connect and
is waiting for a timeout, we already have work queued up to flush as
soon as we start the device. (See commit 0b841030625c ("vhost: vsock:
kick send_pkt worker once device is started")).
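
As a rough userspace sketch of that behavior (all names here, such as
sketch_vsock, sketch_send_pkt and sketch_start, are hypothetical and this is
not the vhost-vsock code): packets submitted before the device is started
stay on a pending list, and starting the device kicks a flush so they are
sent right away.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_PKTS 8

/* Hypothetical device state, not the kernel's struct vhost_vsock. */
struct sketch_vsock {
    bool started;                  /* set once the device is started */
    int pending[SKETCH_MAX_PKTS];  /* packets waiting to be sent */
    size_t npending;
    size_t nsent;                  /* packets actually delivered */
};

/* Deliver everything on the pending list; a no-op before start. */
static void sketch_flush(struct sketch_vsock *v)
{
    if (!v->started)
        return;
    v->nsent += v->npending;
    v->npending = 0;
}

/* Queue a packet; it is only delivered if the device is already running. */
static bool sketch_send_pkt(struct sketch_vsock *v, int pkt)
{
    if (v->npending == SKETCH_MAX_PKTS)
        return false;              /* list full, caller must retry */
    v->pending[v->npending++] = pkt;
    sketch_flush(v);
    return true;
}

/* Start the device and kick the flush for packets queued earlier. */
static void sketch_start(struct sketch_vsock *v)
{
    v->started = true;
    sketch_flush(v);
}
```

Without the kick in sketch_start(), the packets queued before start would sit
on the list until some later send happened to trigger a flush.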

Thanks,
Stefano