Re: [PATCH] vgaarb: fix signal handling in vga_get()

From: David Herrmann
Date: Thu Dec 10 2015 - 05:29:13 EST


Hi

On Mon, Nov 30, 2015 at 3:17 AM, Kirill A. Shutemov
<kirill@xxxxxxxxxxxxx> wrote:
> There are a few defects in vga_get() related to signal handling:
>
> - we shouldn't check for pending signals for TASK_UNINTERRUPTIBLE
> case;
>
> - if we find a pending signal, we must remove ourselves from the wait
> queue and change the task state back to running;
>
> - -ERESTARTSYS is more appropriate, I guess.
>
> Signed-off-by: Kirill A. Shutemov <kirill@xxxxxxxxxxxxx>
> ---
>
> Alex, I am trying to get KVM with VGA passthrough working properly. I have
> i915 (HD 4600) on the host and a GTX 580 for the guest. The guest GPU is not
> capable of EFI, so I have to use x-vga=on. It kind of works with your
> patch for i915.enable_hd_vgaarb=1. But the guest refuses to initialize the
> GPU after KVM was not shut down correctly, resulting in a host crash like
> this:
>
> BUG: unable to handle kernel paging request at ffff880870187ed8
> IP: [<ffff880870187ed8>] 0xffff880870187ed8
> PGD 2129067 PUD 80000008400001e3
> Oops: 0011 [#1] PREEMPT SMP
> Modules linked in: iwlmvm iwlwifi
> CPU: 6 PID: 3983 Comm: qemu-system-x86 Not tainted 4.3.0-gentoo #6
> Hardware name: Gigabyte Technology Co., Ltd. Z87X-UD7 TH/Z87X-UD7 TH-CF, BIOS F5a 06/12/2014
> task: ffff88087a910000 ti: ffff8808632c0000 task.ti: ffff8808632c0000
> RIP: 0010:[<ffff880870187ed8>] [<ffff880870187ed8>] 0xffff880870187ed8
> RSP: 0018:ffff8808632c3d08 EFLAGS: 00010006
> RAX: ffff880870187db0 RBX: 0000000070187f58 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff880870187db0
> RBP: ffff8808632c3d48 R08: 0000000000000000 R09: 0000000000000000
> R10: 00000000000103c0 R11: 0000000000000293 R12: ffffffff81ea03c8
> R13: ffffffff8104c7cb R14: 0000000000000000 R15: 0000000000000003
> FS: 00007f984f9b2700(0000) GS:ffff88089f380000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffff880870187ed8 CR3: 00000008645f8000 CR4: 00000000001426e0
> Stack:
> ffffffff810cc83d 00000000632c3d28 0000000000000000 ffffffff81ea03c0
> 0000000000000046 0000000000000003 0000000000000000 0000000000000000
> ffff8808632c3d80 ffffffff810cca44 ffff88087af63800 0000000000000286
> Call Trace:
> [<ffffffff810cc83d>] ? __wake_up_common+0x4d/0x80
> [<ffffffff810cca44>] __wake_up+0x34/0x50
> [<ffffffff815d99e3>] __vga_put+0x73/0xd0
> [<ffffffff815d9db4>] vga_put+0x54/0x80
> [<ffffffff8169d042>] vfio_pci_vga_rw+0x1d2/0x220
> [<ffffffff8169a7f3>] vfio_pci_rw+0x33/0x60
> [<ffffffff8169abf7>] vfio_pci_write+0x17/0x20
> [<ffffffff816966a6>] vfio_device_fops_write+0x26/0x30
> [<ffffffff811a4b23>] __vfs_write+0x23/0xe0
> [<ffffffff811a4a53>] ? __vfs_read+0x23/0xd0
> [<ffffffff811b6e35>] ? do_vfs_ioctl+0x2b5/0x490
> [<ffffffff811a5194>] vfs_write+0xa4/0x190
> [<ffffffff811a5fa6>] SyS_pwrite64+0x66/0xa0
> [<ffffffff819a17d7>] entry_SYSCALL_64_fastpath+0x12/0x6a
> Code: 88 ff ff e0 7e 18 70 08 88 ff ff 00 8c 57 76 08 88 ff ff 20 7f 18 70 08 88 ff ff 08 7f 18 70 08 88 ff ff 94 51 1a 81 ff ff ff ff <09> 00 00 00 00 00 00 00 01 8c 57 76 08 88 ff ff 00 8c 57 76 08
> RIP [<ffff880870187ed8>] 0xffff880870187ed8
> RSP <ffff8808632c3d08>
> CR2: ffff880870187ed8
>
> The patch fixes the crash, but doesn't help with getting the GPU in the
> guest working again.
>
> Any ideas?
>
> ---
> drivers/gpu/vga/vgaarb.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/vga/vgaarb.c b/drivers/gpu/vga/vgaarb.c
> index 3166e4bc4eb6..9abcaa53bd25 100644
> --- a/drivers/gpu/vga/vgaarb.c
> +++ b/drivers/gpu/vga/vgaarb.c
> @@ -395,8 +395,10 @@ int vga_get(struct pci_dev *pdev, unsigned int rsrc, int interruptible)
> set_current_state(interruptible ?
> TASK_INTERRUPTIBLE :
> TASK_UNINTERRUPTIBLE);
> - if (signal_pending(current)) {
> - rc = -EINTR;
> + if (interruptible && signal_pending(current)) {
> + __set_current_state(TASK_RUNNING);
> + remove_wait_queue(&vga_wait_queue, &wait);
> + rc = -ERESTARTSYS;
> break;

All 3 points are valid, and the patch looks good to me:

Reviewed-by: David Herrmann <dh.herrmann@xxxxxxxxx>

However, there seems to be a race between dropping vga_lock and putting
the thread to sleep. We should fix that as well. See the hunk below
(completely untested... why is VGA still in use? *sigh*).
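
To illustrate the kind of race meant here (a lost wakeup), this is a
minimal userspace sketch using pthreads, not kernel code. All names in it
(demo_res, demo_get, demo_put, demo_run) are hypothetical and are not part
of vgaarb. The assumption is that the problem is the usual one: if the
"is it busy?" check and the act of becoming a waiter are not atomic with
respect to the releaser, a release that lands in between wakes nobody and
the waiter sleeps anyway. With a condition variable, the check is done
under the mutex and pthread_cond_wait() drops the mutex and sleeps
atomically, which closes that window:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an arbitrated resource; not the vgaarb API. */
struct demo_res {
	pthread_mutex_t lock;     /* protects 'busy' */
	pthread_cond_t  released; /* signalled whenever 'busy' is cleared */
	bool            busy;
};

static void demo_res_init(struct demo_res *r)
{
	pthread_mutex_init(&r->lock, NULL);
	pthread_cond_init(&r->released, NULL);
	r->busy = false;
}

/* Acquire: the busy check and going to sleep happen with no gap. */
static void demo_get(struct demo_res *r)
{
	pthread_mutex_lock(&r->lock);
	/* Re-check after every wakeup; wakeups may be spurious. */
	while (r->busy)
		pthread_cond_wait(&r->released, &r->lock);
	r->busy = true;
	pthread_mutex_unlock(&r->lock);
}

/* Release: clear 'busy' and wake waiters while holding the lock. */
static void demo_put(struct demo_res *r)
{
	pthread_mutex_lock(&r->lock);
	r->busy = false;
	pthread_cond_broadcast(&r->released);
	pthread_mutex_unlock(&r->lock);
}

/* Thread body: releases the resource that the creating thread holds. */
static void *demo_releaser(void *arg)
{
	demo_put(arg);
	return NULL;
}

/*
 * Take the resource, start a thread that releases it, then take it
 * again. The second demo_get() returns whether the release happens
 * before or after it starts waiting. Returns 0 on success.
 */
static int demo_run(void)
{
	struct demo_res r;
	pthread_t t;
	int ok;

	demo_res_init(&r);
	demo_get(&r);
	if (pthread_create(&t, NULL, demo_releaser, &r) != 0)
		return -1;
	demo_get(&r);
	pthread_join(t, NULL);
	ok = r.busy ? 0 : -1;
	demo_put(&r);
	return ok;
}
```

The kernel equivalent of "check under the lock, sleep atomically" is to
get onto the wait queue and set the task state before testing the
condition, so that a wakeup arriving after the test still finds us on the
queue.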

Thanks
David

diff --git a/drivers/gpu/vga/vgaarb.c b/drivers/gpu/vga/vgaarb.c
index a0b4334..82cf1e3 100644
--- a/drivers/gpu/vga/vgaarb.c
+++ b/drivers/gpu/vga/vgaarb.c
@@ -359,8 +359,8 @@ static void __vga_put
int vga_get(struct pci_dev *pdev, unsigned int rsrc, int interruptible)
{
struct vga_device *vgadev, *conflict;
+ DECLARE_WAITQUEUE(wait, current);
unsigned long flags;
- wait_queue_t wait;
int rc = 0;

vga_check_first_use();
@@ -371,6 +371,11 @@ int vga_get
return 0;

for (;;) {
+ add_wait_queue(&vga_wait_queue, &wait);
+ set_current_state(interruptible ?
+ TASK_INTERRUPTIBLE :
+ TASK_UNINTERRUPTIBLE);
+
spin_lock_irqsave(&vga_lock, flags);
vgadev = vgadev_find(pdev);
if (vgadev == NULL) {
@@ -383,25 +388,22 @@ int vga_get(struct pci_dev *pdev, unsigned int rsrc, int interruptible)
if (conflict == NULL)
break;

-
/* We have a conflict, we wait until somebody kicks the
* work queue. Currently we have one work queue that we
* kick each time some resources are released, but it would
* be fairly easy to have a per device one so that we only
* need to attach to the conflicting device
*/
- init_waitqueue_entry(&wait, current);
- add_wait_queue(&vga_wait_queue, &wait);
- set_current_state(interruptible ?
- TASK_INTERRUPTIBLE :
- TASK_UNINTERRUPTIBLE);
- if (signal_pending(current)) {
- rc = -EINTR;
+ if (interruptible && signal_pending(current)) {
+ rc = -ERESTARTSYS;
break;
}
schedule();
remove_wait_queue(&vga_wait_queue, &wait);
}
+
+ __set_current_state(TASK_RUNNING);
+ remove_wait_queue(&vga_wait_queue, &wait);
return rc;
}
EXPORT_SYMBOL(vga_get);