Re: [RFC PATCH] KVM: arm/arm64: vgic: change condition for level interrupt resampling

From: Yang, Shunyong
Date: Thu Mar 08 2018 - 04:31:50 EST


Hi, Eric,

First, please let me change Christoffer's email to cdall@xxxxxxxxxxx. I
have added more information about my test below; please check.

On Thu, 2018-03-08 at 09:57 +0100, Auger Eric wrote:
> Hi,
>
> On 08/03/18 08:01, Shunyong Yang wrote:
> >
> > When resampling irqfds is enabled, level interrupt should be
> > de-asserted when resampling happens. On page 4-47 of GIC v3
> > specification IHI0069D, it said,
> > "When the PE acknowledges an SGI, a PPI, or an SPI at the CPU
> > interface, the IRI changes the status of the interrupt to active
> > and pending if:
> > - It is an edge-triggered interrupt, and another edge has been
> > detected since the interrupt was acknowledged.
> > - It is a level-sensitive interrupt, and the level has not been
> > deasserted since the interrupt was acknowledged."
> >
> > GIC v2 specification IHI0048B.b has similar description on page
> > 3-42 for state machine transition.
> >
> > When a VFIO device such as mtty (the 8250 VFIO mdev emulation driver
> > in samples/vfio-mdev) triggers a level interrupt, the state
> > transition in the LR is pending-->active-->active and pending.
> > It then waits for resampling to de-assert the interrupt.
> >
> > The current lr_signals_eoi_mi() returns false if the state in the LR
> > is not invalid (inactive). As a result, resampling never happens in
> > the mtty case.
> >
> > This causes the interrupt to fire continuously in the guest even
> > when the 8250 IIR reports no interrupt. When the 8250's interrupt is
> > configured in shared mode, it passes the interrupt on to other
> > drivers to handle. However, no other driver is involved, so a
> > "nobody cared" kernel complaint occurs.
> >
> > / # cat /dev/ttyS0
> > [    4.826836] random: crng init done
> > [    6.373620] irq 41: nobody cared (try booting with the "irqpoll"
> > option)
> > [    6.376414] CPU: 0 PID: 1307 Comm: cat Not tainted 4.16.0-rc4 #4
> > [    6.378927] Hardware name: linux,dummy-virt (DT)
> > [    6.380876] Call trace:
> > [    6.381937]  dump_backtrace+0x0/0x180
> > [    6.383495]  show_stack+0x14/0x1c
> > [    6.384902]  dump_stack+0x90/0xb4
> > [    6.386312]  __report_bad_irq+0x38/0xe0
> > [    6.387944]  note_interrupt+0x1f4/0x2b8
> > [    6.389568]  handle_irq_event_percpu+0x54/0x7c
> > [    6.391433]  handle_irq_event+0x44/0x74
> > [    6.393056]  handle_fasteoi_irq+0x9c/0x154
> > [    6.394784]  generic_handle_irq+0x24/0x38
> > [    6.396483]  __handle_domain_irq+0x60/0xb4
> > [    6.398207]  gic_handle_irq+0x98/0x1b0
> > [    6.399796]  el1_irq+0xb0/0x128
> > [    6.401138]  _raw_spin_unlock_irqrestore+0x18/0x40
> > [    6.403149]  __setup_irq+0x41c/0x678
> > [    6.404669]  request_threaded_irq+0xe0/0x190
> > [    6.406474]  univ8250_setup_irq+0x208/0x234
> > [    6.408250]  serial8250_do_startup+0x1b4/0x754
> > [    6.410123]  serial8250_startup+0x20/0x28
> > [    6.411826]  uart_startup.part.21+0x78/0x144
> > [    6.413633]  uart_port_activate+0x50/0x68
> > [    6.415328]  tty_port_open+0x84/0xd4
> > [    6.416851]  uart_open+0x34/0x44
> > [    6.418229]  tty_open+0xec/0x3c8
> > [    6.419610]  chrdev_open+0xb0/0x198
> > [    6.421093]  do_dentry_open+0x200/0x310
> > [    6.422714]  vfs_open+0x54/0x84
> > [    6.424054]  path_openat+0x2dc/0xf04
> > [    6.425569]  do_filp_open+0x68/0xd8
> > [    6.427044]  do_sys_open+0x16c/0x224
> > [    6.428563]  SyS_openat+0x10/0x18
> > [    6.429972]  el0_svc_naked+0x30/0x34
> > [    6.431494] handlers:
> > [    6.432479] [<000000000e9fb4bb>] serial8250_interrupt
> > [    6.434597] Disabling IRQ #41
> >
> > This patch changes the LR state condition in lr_signals_eoi_mi()
> > from invalid (inactive) to active and pending to avoid this.
> >
> > I am not sure about the original design behind the invalid
> > (inactive) condition, so this RFC is sent out for comments.
> >
> > Cc: Joey Zheng <yu.zheng@xxxxxxxxxxxxxxxx>
> > Signed-off-by: Shunyong Yang <shunyong.yang@xxxxxxxxxxxxxxxx>
> > ---
> >  virt/kvm/arm/vgic/vgic-v2.c | 4 ++--
> >  virt/kvm/arm/vgic/vgic-v3.c | 4 ++--
> >  2 files changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
> > index e9d840a75e7b..740ee9a5f551 100644
> > --- a/virt/kvm/arm/vgic/vgic-v2.c
> > +++ b/virt/kvm/arm/vgic/vgic-v2.c
> > @@ -46,8 +46,8 @@ void vgic_v2_set_underflow(struct kvm_vcpu *vcpu)
> >
> >  static bool lr_signals_eoi_mi(u32 lr_val)
> >  {
> > -	return !(lr_val & GICH_LR_STATE) && (lr_val & GICH_LR_EOI) &&
> > -	       !(lr_val & GICH_LR_HW);
> > +	return !((lr_val & GICH_LR_STATE) ^ GICH_LR_STATE) &&
> > +	       (lr_val & GICH_LR_EOI) && !(lr_val & GICH_LR_HW);
> >  }
> >
> >  /*
> > diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
> > index 6b329414e57a..43111bba7af9 100644
> > --- a/virt/kvm/arm/vgic/vgic-v3.c
> > +++ b/virt/kvm/arm/vgic/vgic-v3.c
> > @@ -35,8 +35,8 @@ void vgic_v3_set_underflow(struct kvm_vcpu *vcpu)
> >
> >  static bool lr_signals_eoi_mi(u64 lr_val)
> >  {
> > -	return !(lr_val & ICH_LR_STATE) && (lr_val & ICH_LR_EOI) &&
> > -	       !(lr_val & ICH_LR_HW);
> > +	return !((lr_val & ICH_LR_STATE) ^ ICH_LR_STATE) &&
> > +	       (lr_val & ICH_LR_EOI) && !(lr_val & ICH_LR_HW);
>
> In general don't we have this state transition
>
> inactive -> pending -> pending + active (1) -> active -> inactive.
>
> In that case won't we lower the virt irq level when folding the LR on
> Pending + Active state, which is not what we want?
>
> Thanks
>
> Eric

With the current code, in my test, I printed the LR value of the mtty
IRQ 41 (hwirq = 36) in vgic_v3_fold_lr_state(). The LR transitions
start as follows:

0-->50a0020000000024-->90a0020000000024-->d0a0020000000024

That is inactive-->pending-->active-->pending + active.
Then it keeps cycling through pending-->active-->pending + active.
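
For reference, here is a small user-space sketch of how those raw values
can be decoded. The bit positions follow the GICv3 ICH_LR<n>_EL2 layout
and the kernel's ICH_LR_* macros; the helper names (lr_state_bits(),
lr_state_name(), lr_intid(), lr_dump(), dump_observed_trace()) are
hypothetical and exist only for this illustration, not in the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bit positions as in the GICv3 list register layout. */
#define ICH_LR_VIRTUAL_ID_MASK	((1ULL << 32) - 1)
#define ICH_LR_EOI		(1ULL << 41)
#define ICH_LR_HW		(1ULL << 61)

/* Hypothetical helper: return the two state bits [63:62]. */
static unsigned int lr_state_bits(uint64_t lr_val)
{
	return (unsigned int)((lr_val >> 62) & 0x3);
}

/* Hypothetical helper: name the state encoded in bits [63:62]. */
static const char *lr_state_name(uint64_t lr_val)
{
	static const char * const names[] = {
		"inactive", "pending", "active", "pending + active",
	};

	return names[lr_state_bits(lr_val)];
}

/* Hypothetical helper: extract the virtual INTID from bits [31:0]. */
static uint32_t lr_intid(uint64_t lr_val)
{
	return (uint32_t)(lr_val & ICH_LR_VIRTUAL_ID_MASK);
}

/* Print one LR value in decoded form. */
static void lr_dump(uint64_t lr_val)
{
	printf("%016llx: state=%s eoi=%d hw=%d intid=%u\n",
	       (unsigned long long)lr_val, lr_state_name(lr_val),
	       !!(lr_val & ICH_LR_EOI), !!(lr_val & ICH_LR_HW),
	       (unsigned int)lr_intid(lr_val));
}

/* Decode the three non-zero LR values quoted in this mail. */
static void dump_observed_trace(void)
{
	static const uint64_t trace[] = {
		0x50a0020000000024ULL,
		0x90a0020000000024ULL,
		0xd0a0020000000024ULL,
	};
	size_t i;

	for (i = 0; i < sizeof(trace) / sizeof(trace[0]); i++) {
		/* hwirq 36 is carried in the INTID field. */
		assert(lr_intid(trace[i]) == 36);
		lr_dump(trace[i]);
	}
}
```

Calling dump_observed_trace() from a main() prints one decoded line per
value. In all three values the EOI bit is set and the HW bit is clear,
so only the state field changes between them.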

The level interrupt de-assert should happen in the following code:

	/* Notify fds when the guest EOI'ed a level-triggered IRQ */
	if (lr_signals_eoi_mi(val) && vgic_valid_spi(vcpu->kvm, intid))
		kvm_notify_acked_irq(vcpu->kvm, 0,
				     intid - VGIC_NR_PRIVATE_IRQS);

But as described in the commit message, lr_signals_eoi_mi() returns
false if the state in the LR is not invalid (inactive), so it never gets
a chance to de-assert the level interrupt in my test.
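
To make the difference concrete, the two conditions can be compared in
user space on the LR values from my trace. This is only a sketch: the
function names lr_signals_eoi_mi_old(), lr_signals_eoi_mi_rfc() and
check_lr_values() are hypothetical, uint64_t stands in for the kernel's
u64, and the last value in the check (state inactive, EOI set) is made
up for comparison and does not come from my test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as in the GICv3 list register layout. */
#define ICH_LR_EOI	(1ULL << 41)
#define ICH_LR_HW	(1ULL << 61)
#define ICH_LR_STATE	(3ULL << 62)

/* Condition in the current code: state must be invalid (inactive). */
static bool lr_signals_eoi_mi_old(uint64_t lr_val)
{
	return !(lr_val & ICH_LR_STATE) && (lr_val & ICH_LR_EOI) &&
	       !(lr_val & ICH_LR_HW);
}

/*
 * Condition from the RFC patch: state must be pending + active.
 * !((x & M) ^ M) is true exactly when (x & M) == M.
 */
static bool lr_signals_eoi_mi_rfc(uint64_t lr_val)
{
	return !((lr_val & ICH_LR_STATE) ^ ICH_LR_STATE) &&
	       (lr_val & ICH_LR_EOI) && !(lr_val & ICH_LR_HW);
}

/* Evaluate both conditions on the values discussed in this mail. */
static void check_lr_values(void)
{
	/* Values from the trace: pending, active, pending + active. */
	assert(!lr_signals_eoi_mi_old(0x50a0020000000024ULL));
	assert(!lr_signals_eoi_mi_old(0x90a0020000000024ULL));
	assert(!lr_signals_eoi_mi_old(0xd0a0020000000024ULL));

	assert(!lr_signals_eoi_mi_rfc(0x50a0020000000024ULL));
	assert(!lr_signals_eoi_mi_rfc(0x90a0020000000024ULL));
	assert(lr_signals_eoi_mi_rfc(0xd0a0020000000024ULL));

	/* Made-up value, not from the trace: inactive with EOI set. */
	assert(lr_signals_eoi_mi_old(0x00a0020000000024ULL));
	assert(!lr_signals_eoi_mi_rfc(0x00a0020000000024ULL));
}
```

None of the values seen in the trace satisfies the current condition,
while the RFC condition matches only the pending + active one. The
made-up inactive value shows the opposite behaviour, which is the case
the question above is about.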

Thanks.
Shunyong.

>
> >
> >  }
> >
> >  void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
> >