Re: [PATCH] smp/call: Detect stuck CSD locks

From: Ingo Molnar
Date: Mon May 11 2015 - 10:00:15 EST



* Chris J Arges <chris.j.arges@xxxxxxxxxxxxx> wrote:

> Later in the trace we see the same call followed by
> vmx_handle_external_intr() ignoring the call:
>
> [ 603.248016] 2452.083823 | 0) | ptep_clear_flush() {
> [ 603.248016] 2452.083824 | 0) | flush_tlb_page() {
> [ 603.248016] 2452.083824 | 0) 0.109 us | leave_mm();
> [ 603.248016] 2452.083824 | 0) | native_flush_tlb_others() {
> [ 603.248016] 2452.083824 | 0) | smp_call_function_many() {
> [ 603.248016] 2452.083825 | 0) | smp_call_function_single() {
> [ 603.248016] 2452.083825 | 0) | generic_exec_single() {
> [ 603.248016] 2452.083825 | 0) | native_send_call_func_single_ipi() {
> [ 603.248016] 2452.083825 | 0) | x2apic_send_IPI_mask() {
> [ 603.248016] 2452.083826 | 0) 1.625 us | __x2apic_send_IPI_mask();
> [ 603.248016] 2452.083828 | 0) 2.173 us | }
> [ 603.248016] 2452.083828 | 0) 2.588 us | }
> [ 603.248016] 2452.083828 | 0) 3.082 us | }
> [ 603.248016] 2452.083828 | 0) | csd_lock_wait.isra.4() {
> [ 603.248016] 2452.083848 | 1) + 44.033 us | }
> [ 603.248016] 2452.083849 | 1) 0.975 us | vmx_read_l1_tsc();
> [ 603.248016] 2452.083851 | 1) 1.031 us | vmx_handle_external_intr();
> [ 603.248016] 2452.083852 | 1) 0.234 us | __srcu_read_lock();
> [ 603.248016] 2452.083853 | 1) | vmx_handle_exit() {
> [ 603.248016] 2452.083854 | 1) | handle_ept_violation() {
> [ 603.248016] 2452.083856 | 1) | kvm_mmu_page_fault() {
> [ 603.248016] 2452.083856 | 1) | tdp_page_fault() {
> [ 603.248016] 2452.083856 | 1) 0.092 us | mmu_topup_memory_caches();
> [ 603.248016] 2452.083857 | 1) | gfn_to_memslot_dirty_bitmap.isra.84() {
> [ 603.248016] 2452.083857 | 1) 0.231 us | gfn_to_memslot();
> [ 603.248016] 2452.083858 | 1) 0.774 us | }
>
> So potentially, CPU0 generated an interrupt that caused
> vcpu_enter_guest to be called on CPU1. However, when
> vmx_handle_external_intr was called, it didn't progress any further.

So it does look like the IPI is getting lost in the KVM code?

So why did vmx_handle_external_intr() skip the irq injection - were
IRQs disabled in the guest perhaps?

> Another experiment here would be to dump
> vmcs_read32(VM_EXIT_INTR_INFO); to see why we don't handle the
> interrupt.

Possibly, but it would also help to instrument the KVM IRQ injection
code to see when it skips an IPI and why.
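
For the VM_EXIT_INTR_INFO dump, a rough user-space sketch of how the raw
value could be decoded. This is not a kernel patch: the mask values
follow the interruption-information layout in the Intel SDM (and the
names in arch/x86/include/asm/vmx.h), but the helper functions are
hypothetical and the sample value in the usage note is made up.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* VM-exit interruption-information field layout (per the Intel SDM). */
#define INTR_INFO_VECTOR_MASK     0x000000ffu  /* bits 7:0  */
#define INTR_INFO_INTR_TYPE_MASK  0x00000700u  /* bits 10:8 */
#define INTR_INFO_VALID_MASK      0x80000000u  /* bit 31    */
#define INTR_TYPE_EXT_INTR        (0u << 8)    /* type 0: external interrupt */

/* Hypothetical helper: true if the field describes a valid external interrupt. */
static bool is_valid_ext_intr(uint32_t info)
{
	return (info & (INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK)) ==
	       (INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR);
}

/* Hypothetical helper: extract the interrupt vector. */
static unsigned int intr_vector(uint32_t info)
{
	return info & INTR_INFO_VECTOR_MASK;
}

/* Hypothetical helper: print the decoded fields of a dumped value. */
static void dump_intr_info(uint32_t info)
{
	printf("intr_info=0x%08x valid=%d type=%u vector=0x%02x ext_intr=%d\n",
	       (unsigned int)info,
	       (info & INTR_INFO_VALID_MASK) != 0,
	       (unsigned int)((info & INTR_INFO_INTR_TYPE_MASK) >> 8),
	       intr_vector(info),
	       is_valid_ext_intr(info));
}
```

Calling dump_intr_info(0x800000fb) on such a made-up value would report a
valid external interrupt with vector 0xfb. A value with the valid bit
clear, or with a non-zero type, would show at a glance that the exit was
not treated as an external interrupt.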

Thanks,

Ingo