Re: [RFC PATCH 0/3] generic hypercall support

From: Paul E. McKenney
Date: Fri May 08 2009 - 12:49:01 EST


On Fri, May 08, 2009 at 08:43:40AM -0400, Gregory Haskins wrote:
> Marcelo Tosatti wrote:
> > On Fri, May 08, 2009 at 10:59:00AM +0300, Avi Kivity wrote:
> >
> >> Marcelo Tosatti wrote:
> >>
> >>> I think the comparison is not entirely fair. You're using
> >>> KVM_HC_VAPIC_POLL_IRQ ("null" hypercall) and the compiler optimizes that
> >>> (on Intel) to only one register read:
> >>>
> >>> nr = kvm_register_read(vcpu, VCPU_REGS_RAX);
> >>>
> >>> Whereas in a real hypercall for (say) PIO you would need the address,
> >>> size, direction and data.
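
For illustration, a sketch of the extra register traffic a non-null
hypercall would need; the nr/a0..a3 convention below follows the x86 KVM
hypercall calling convention, but mapping PIO's port/size/direction/data
onto those arguments is purely hypothetical:

        nr = kvm_register_read(vcpu, VCPU_REGS_RAX);    /* hypercall number */
        a0 = kvm_register_read(vcpu, VCPU_REGS_RBX);    /* e.g. port        */
        a1 = kvm_register_read(vcpu, VCPU_REGS_RCX);    /* e.g. size        */
        a2 = kvm_register_read(vcpu, VCPU_REGS_RDX);    /* e.g. direction   */
        a3 = kvm_register_read(vcpu, VCPU_REGS_RSI);    /* e.g. data        */
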
> >>>
> >>>
> >> Well, that's probably one of the reasons pio is slower, as the cpu has
> >> to set these up, and the kernel has to read them.
> >>
> >>
> >>> Also for PIO/MMIO you're adding this unoptimized lookup to the
> >>> measurement:
> >>>
> >>> pio_dev = vcpu_find_pio_dev(vcpu, port, size, !in);
> >>> if (pio_dev) {
> >>>         kernel_pio(pio_dev, vcpu, vcpu->arch.pio_data);
> >>>         complete_pio(vcpu);
> >>>         return 1;
> >>> }
> >>>
> >>>
> >> Since there are only one or two elements in the list, I don't see how it
> >> could be optimized.
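
For reference, a rough sketch of the kind of scan in question, with the
bus and device shapes simplified and the field/callback names made up for
the example (this is not the actual KVM code):

        struct kvm_io_device {
                int (*in_range)(struct kvm_io_device *dev,
                                unsigned long addr, int len, int is_write);
                /* read/write callbacks omitted */
        };

        struct kvm_io_bus {
                int                   dev_count;
                struct kvm_io_device *devs[8];
        };

        static struct kvm_io_device *bus_find_dev(struct kvm_io_bus *bus,
                                                  unsigned long addr,
                                                  int len, int is_write)
        {
                int i;

                /* Linear scan: with only a handful of devices on the bus,
                 * a fancier lookup structure buys essentially nothing. */
                for (i = 0; i < bus->dev_count; i++) {
                        struct kvm_io_device *dev = bus->devs[i];

                        if (dev->in_range(dev, addr, len, is_write))
                                return dev;
                }
                return NULL;
        }
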
> >>
> >
> > speaker_ioport, pit_ioport, pic_ioport, plus the nulldev ioport. nulldev
> > is probably the last in the io_bus list.
> >
> > Not sure if this one matters very much. The point is that you should
> > measure only the exit time, not the PIO path vs. the hypercall path in KVM.
> >
>
> The problem is that the exit time in and of itself isn't all that
> interesting to me. What I am interested in measuring is how long it takes
> KVM to process the request and realize that I want to execute function
> "X". Ultimately that is what matters in terms of execution latency and is
> thus the more interesting data. I think the exit time is possibly an
> interesting 5th data point, but it's more of a side-bar, IMO. In any
> case, I suspect that both exits will be approximately the same at the
> VT/SVM level.
>
> OTOH: If there is a patch out there to improve KVM's code (say,
> specifically the PIO handling logic), that is fair game here and we
> should benchmark it. For instance, if you have ideas on ways to improve
> the find_pio_dev performance, etc.... One item may be to replace the
> kvm->lock on the bus scan with RCU or something.... (though PIOs are
> very frequent and the constant re-entry into an RCU read-side CS may
> effectively cause a perpetual grace period and may be too prohibitive).
> CC'ing pmck.
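
To make the suggestion concrete, a minimal sketch of what such a
conversion might look like, assuming the bus were turned into an
RCU-managed pointer (kvm->pio_bus here) that is modified only under
kvm->lock, and reusing the bus_find_dev() sketch above; the names, fields
and callback signatures are illustrative, and error handling is omitted:

        /* Reader, e.g. once per PIO exit: no kvm->lock needed.  The device
         * must be used inside the read-side critical section, because the
         * updater frees the old bus copy only after synchronize_rcu(). */
        rcu_read_lock();
        bus = rcu_dereference(kvm->pio_bus);
        dev = bus_find_dev(bus, port, size, is_write);
        if (dev)
                dev->write(dev, port, size, data);
        rcu_read_unlock();

        /* Updater: device registration is rare, so copy, modify, publish,
         * and defer the free until pre-existing readers have finished. */
        mutex_lock(&kvm->lock);
        old_bus = kvm->pio_bus;
        new_bus = kmemdup(old_bus, sizeof(*old_bus), GFP_KERNEL);
        new_bus->devs[new_bus->dev_count++] = new_dev;
        rcu_assign_pointer(kvm->pio_bus, new_bus);
        mutex_unlock(&kvm->lock);
        synchronize_rcu();      /* waits only for pre-existing readers */
        kfree(old_bus);
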

Hello, Greg!

Not a problem. ;-)

A grace period only needs to wait on RCU read-side critical sections that
started before the grace period started. As soon as those pre-existing
RCU read-side critical sections get done, the grace period can end, regardless
of how many RCU read-side critical sections might have started after
the grace period started.
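
As a minimal, generic sketch (gp, newp and do_something() are placeholders,
not anything in KVM):

        /* Reader: */
        rcu_read_lock();
        p = rcu_dereference(gp);
        if (p)
                do_something(p);        /* must finish before the unlock */
        rcu_read_unlock();

        /* Updater: */
        old = gp;
        rcu_assign_pointer(gp, newp);
        synchronize_rcu();      /* waits only for readers that were already
                                   inside rcu_read_lock() when it was called */
        kfree(old);
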

If you find a situation where huge numbers of RCU read-side critical
sections do indefinitely delay a grace period, then that is a bug in
RCU that I need to fix.

Of course, if you have a single RCU read-side critical section that
runs for a very long time, that -will- delay a grace period. As long
as you don't do it too often, this is not a problem, though running
a single RCU read-side critical section for more than a few milliseconds
is probably not a good thing. Not as bad as holding a heavily contended
spinlock for a few milliseconds, but still not a good thing.
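
One common way to bound the length of each read-side critical section when
walking a large RCU-protected table is to work in chunks keyed by a stable
index; a sketch, where the table and the process() helper are hypothetical:

        for (i = 0; i < nr_entries; i++) {
                rcu_read_lock();
                e = rcu_dereference(table[i]);  /* the index is a stable key */
                if (e)
                        process(e);             /* short, non-sleeping work */
                rcu_read_unlock();

                cond_resched();         /* keep each section brief and let
                                           grace periods make progress */
        }
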

Thanx, Paul

> FWIW: the PIOoHCs were about 140ns slower than the pure HC, so some of that
> 140ns can possibly be recouped. I currently suspect the lock acquisition
> in the io_bus scan is the bulk of that time, but that is admittedly a
> guess. The remaining 200-250ns is elsewhere in the PIO decode.
>
> -Greg
>
