On Thu, Dec 14, 2017 at 12:57:54PM +0800, Jia He wrote:Please see the WARN_ON result (without my patch)
Hi Jia,
I have tried your newer level-mapped-v7 branch, but bug is still there.Yes, I think there's something there if the hardware doesn't retire the
There is no special load in both host and guest. The guest (kernel
4.14) is often hanging when booting
the guest kernel log
[ OK ] Reached target Remote File Systems.
Starting File System Check on /dev/mapper/fedora-root...
[ OK ] Started File System Check on /dev/mapper/fedora-root.
Mounting /sysroot...
[ 2.670764] SGI XFS with ACLs, security attributes, no debug enabled
[ 2.678180] XFS (dm-0): Mounting V5 Filesystem
[ 2.740364] XFS (dm-0): Ending clean mount
[ OK ] Mounted /sysroot.
[ OK ] Reached target Initrd Root File System.
Starting Reload Configuration from the Real Root...
[ 61.288215] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 61.290791] 1-...!: (0 ticks this GP) idle=574/0/0 softirq=5/5 fqs=1
[ 61.293664] (detected by 0, t=6002 jiffies, g=-263, c=-264, q=39760)
[ 61.296480] Task dump for CPU 1:
[ 61.297938] swapper/1 R running task 0 0 1 0x00000020
[ 61.300643] Call trace:
[ 61.301260] __switch_to+0x6c/0x78
[ 61.302095] cpu_number+0x0/0x8
[ 61.302867] rcu_sched kthread starved for 6000 jiffies!
g18446744073709551353 c18446744073709551352 f0x0 RCU_GP_WAIT_FQS(3)
->state=0x402 ->cpu=1
[ 61.305941] rcu_sched I 0 8 2 0x00000020
[ 61.307250] Call trace:
[ 61.307854] __switch_to+0x6c/0x78
[ 61.308693] __schedule+0x268/0x8f0
[ 61.309545] schedule+0x2c/0x88
[ 61.310325] schedule_timeout+0x84/0x3b8
[ 61.311278] rcu_gp_kthread+0x4d4/0x7d8
[ 61.312213] kthread+0x134/0x138
[ 61.313001] ret_from_fork+0x10/0x1c
Maybe my previous patch is not perfect enough, thanks for your comments.
I digged it futher more, do you think below code logic is possibly
problematic?
vtimer_save_stateÂÂÂÂÂÂÂÂÂÂ (vtimer->loaded = false, cntv_ctl is 0)
kvm_arch_timer_handlerÂÂÂÂÂÂÂÂ(read cntv_ctl and set vtimer->cnt_ctl = 0)
vtimer_restore_state     Â (write vtimer->cnt_ctl to cntv_ctl,
then cntv_ctl will
ÂÂÂ ÂÂÂ ÂÂÂ ÂÂÂ ÂÂ Â Â be 0 forever)
If above analysis is reasonable
signal fast enough...
how about below patch? alreadyThere's nothing really wrong with that patch, I just didn't think it
tested in my arm64 server.
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index f9555b1..ee6dd3f 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -99,7 +99,7 @@ static irqreturn_t kvm_arch_timer_handler(int irq,
void *dev_id)
ÂÂÂÂÂÂÂ }
ÂÂÂÂÂÂÂ vtimer = vcpu_vtimer(vcpu);
-ÂÂÂÂÂÂ if (!vtimer->irq.level) {
+ÂÂÂÂÂÂ if (vtimer->loaded && !vtimer->irq.level) {
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ vtimer->cnt_ctl = read_sysreg_el0(cntv_ctl);
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ if (kvm_timer_irq_can_fire(vtimer))
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kvm_timer_update_irq(vcpu, true, vtimer);
would be necessary, as we really shouldn't see interrupts if the timer
is not loaded. Can you confirm that a WARN_ON(!vtimer->loaded) in
kvm_arch_timer_handler() gives you a splat?
Also, could you give the following a try (without your patch):No luckï the bug is still there
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 73d262c4712b..4751255345d1 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -367,6 +367,7 @@ static void vtimer_save_state(struct kvm_vcpu *vcpu)
/* Disable the virtual timer */
write_sysreg_el0(0, cntv_ctl);
+ isb();