On Thu, 14 Dec 2023 12:13:57 +0000,
Kunkun Jiang <jiangkunkun@xxxxxxxxxx> wrote:
>
> Hi list,
>
> Sorry. The formatting is a bit messy.
>
> We have observed on GICv4.1 systems that, after a guest reset, the
> secondary VCPU accidentally receives an IPI_CPU_STOP and eventually
> fails to come online.
>
> | Guest User-space

Guest userspace????

> |
> | reset (with a vSGI pending in the hardware) [0]
> |
> | disable the distributor (write 0 into GICD_CTLR) [1]
> |
> | clear pending status (write 0 into GICR_ISPENDR0 for each RD) [2]

This cannot clear the pending bits. Writing 0 to any of the
ISPEND/ICPEND registers is effectively a NOP. If you want to remove
the pending SGIs, you need to write a bunch of 1s to ICPENDR0.

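As a reference point, the architectural behaviour is write-1-to-set /
write-1-to-clear. A minimal sketch of clearing all pending SGIs,
assuming gicr_sgi_base points at a mapped redistributor SGI_base frame
(the offsets below are the architectural ones):

#include <stdint.h>

#define GICR_ISPENDR0	0x0200	/* write 1 to set pending; zeroes are ignored */
#define GICR_ICPENDR0	0x0280	/* write 1 to clear pending; zeroes are ignored */

static inline void mmio_write32(volatile void *addr, uint32_t val)
{
	*(volatile uint32_t *)addr = val;
}

static void clear_pending_sgis(volatile uint8_t *gicr_sgi_base)
{
	/* Effectively a NOP: writing 0 to ISPENDR0 changes nothing */
	mmio_write32(gicr_sgi_base + GICR_ISPENDR0, 0x0);

	/* This is what actually clears SGIs 0-15 (bits [15:0]) */
	mmio_write32(gicr_sgi_base + GICR_ICPENDR0, 0x0000ffff);
}

Note that this is the *architectural* MMIO view; the userspace access
path discussed below has different, latch-based semantics.
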
> |
> | disable the distributor (write 0 into GICD_CTLR) [3]
> |
> | enable the distributor with ARE, Group1 and nASSGIreq [4]

I don't understand what userspace does here. Why is it significant
that it is an EL0 access? I don't know of any SW doing that, but even
if it existed, this should make no difference.

> The problem is that even if user-space tries to disable the distributor
> and clear pending bits for all SGIs, we don't actually propagate it into
> HW (we only record it via vgic_dist->{enabled, nassgireq} and
> vgic_irq->pending_latch) and the vSGI remains pending across the guest
> reset.
>
> And when we're at [4], all vSGIs' vgic_irq->hw are *true* and
> vgic_v4_enable_vsgis() becomes a nop. That's not good.

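For context, the nop comes from an early-out on irq->hw in the per-SGI
loop. A condensed paraphrase (not the verbatim kernel code) of that
path:

	for (i = 0; i < VGIC_NR_SGIS; i++) {
		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, i);

		raw_spin_lock_irqsave(&irq->irq_lock, flags);
		if (irq->hw) {
			/* vSGI already forwarded to HW: nothing more is
			 * done, so any stale HW pending state survives */
			raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
			vgic_put_irq(vcpu->kvm, irq);
			continue;
		}

		/* ... first-time setup: set irq->hw, resolve host_irq,
		 * push config and pending state out to the vPE ... */
		raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
		vgic_put_irq(vcpu->kvm, irq);
	}
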
> The following solution can solve the problem. Not sure if this is a good
> solution. Sent out early for suggestions or solutions.
>
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index 89117ba2528a..3678ab33f5b9 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -374,6 +374,10 @@ static int vgic_v3_uaccess_write_pending(struct kvm_vcpu *vcpu,
>  		irq->pending_latch = true;
>  		vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
>  	} else {
> +		if (irq->hw && vgic_irq_is_sgi(irq->intid))
> +			irq_set_irqchip_state(irq->host_irq,
> +					      IRQCHIP_STATE_PENDING,
> +					      false);
>  		irq->pending_latch = false;
>  		raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
>  	}

But this has *nothing* to do with the guest. This is the *host*
userspace performing a write to the redistributor view, which has
different semantics. Which is why your earlier description made no
sense to me.

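To make the distinction concrete: the writes in steps [1]-[2] reach KVM
through the vGIC device ioctls, not through a guest MMIO trap. A sketch
of such a host userspace write, assuming a GICv3 KVM device fd and the
target vCPU's MPIDR affinity value; the 0x10200 offset for GICR_ISPENDR0
(SGI_base frame 64K above RD_base in the REDIST_REGS view) is my reading
of the layout and worth double-checking:

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int vmm_write_ispendr0(int vgic_fd, uint64_t mpidr)
{
	uint32_t val = 0;	/* only "clears" KVM's pending_latch view */
	struct kvm_device_attr attr = {
		.group	= KVM_DEV_ARM_VGIC_GRP_REDIST_REGS,
		.attr	= (mpidr << KVM_DEV_ARM_VGIC_V3_MPIDR_SHIFT) |
			  0x10200,	/* assumed GICR_ISPENDR0 offset */
		.addr	= (uint64_t)(unsigned long)&val,
	};

	return ioctl(vgic_fd, KVM_SET_DEVICE_ATTR, &attr);
}

This ends up in vgic_v3_uaccess_write_pending(), which is exactly the
path the patches in this thread modify.
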
I think the problem is slightly larger than what you describe. A write
to ISPENDR0 should be propagated to the ITS for any value of the
latch, just like this happens on enabling HW-backed SGIs.

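For reference, the "enabling" case already does roughly this when a
vSGI is first forwarded to HW (again a paraphrase): the latch is pushed
out through the irqchip state and then dropped, since the HW owns the
pending bit from that point on.

	/* transfer the pending state to the HW vSGI, then forget it */
	ret = irq_set_irqchip_state(irq->host_irq, IRQCHIP_STATE_PENDING,
				    irq->pending_latch);
	WARN_ON(ret);
	irq->pending_latch = false;

The userspace write path should do the same for whatever value was
written, which is what the patch below does.
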
Can you please give this a go?

Thanks,

	M.

>From 9932d74392d969057e84bc3c18bd15f1769b6211 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz@xxxxxxxxxx>
Date: Sun, 17 Dec 2023 11:15:09 +0000
Subject: [PATCH] KVM: arm64: vgic-v4: Restore pending state on host userspace write

When the VMM writes to ISPENDR0 to set the pending state of an SGI, we
fail to convey this to the HW if this SGI is already backed by a
GICv4.1 vSGI.

This is a bit of a corner case, as this would only occur if the vgic
state is changed on an already running VM, but this can apparently
happen across a guest reset driven by the VMM.

Fix this by always writing out the pending_latch value to the HW, and
resetting it to false.

Reported-by: Kunkun Jiang <jiangkunkun@xxxxxxxxxx>
Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
Link: https://lore.kernel.org/r/7e7f2c0c-448b-10a9-8929-4b8f4f6e2a32@xxxxxxxxxx
---
 arch/arm64/kvm/vgic/vgic-mmio-v3.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
index a764b0ab8bf9..2533f264b954 100644
--- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
@@ -365,19 +365,26 @@ static int vgic_v3_uaccess_write_pending(struct kvm_vcpu *vcpu,
 		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
 
 		raw_spin_lock_irqsave(&irq->irq_lock, flags);
-		if (test_bit(i, &val)) {
-			/*
-			 * pending_latch is set irrespective of irq type
-			 * (level or edge) to avoid dependency that VM should
-			 * restore irq config before pending info.
-			 */
-			irq->pending_latch = true;
-			vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
-		} else {
+
+		/*
+		 * pending_latch is set irrespective of irq type
+		 * (level or edge) to avoid dependency that VM should
+		 * restore irq config before pending info.
+		 */
+		irq->pending_latch = test_bit(i, &val);
+
+		if (irq->hw && vgic_irq_is_sgi(irq->intid)) {
+			irq_set_irqchip_state(irq->host_irq,
+					      IRQCHIP_STATE_PENDING,
+					      irq->pending_latch);
 			irq->pending_latch = false;
-			raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
 		}
 
+		if (irq->pending_latch)
+			vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
+		else
+			raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
+
 		vgic_put_irq(vcpu->kvm, irq);
 	}

Kunkun Jiang <jiangkunkun@xxxxxxxxxx> later replied:

> I have given it a go. I tested resetting the guest 100 times in
> succession.