Re: [PATCH] irqchip/gic-v4.1: Optimize the delay time of the poll on the GICR_VPENDBASER.Dirty bit
From: Marc Zyngier
Date: Wed Sep 16 2020 - 04:39:48 EST
On 2020-09-16 08:04, lushenming wrote:
> Hi,
>
> Our team just discussed this issue again and consulted our GIC hardware
> design team. They think the RD can afford busy waiting. So we still
> think maybe 0 is better, at least for our hardware.
>
> In addition, if not 0, as I said before, in our measurement, it takes
> only hundreds of nanoseconds, or 1~2 microseconds, to finish parsing
> the VPT in most cases. So maybe 1 microsecond, or smaller, is more
> appropriate. Anyway, 10 microseconds is too much.
>
> But it has to be said that it does depend on the hardware
> implementation.

Exactly. And given that the only publicly available implementation is
a software model, I am reluctant to change "performance"-related things
based on benchmarks that can't be verified, for what appears to me to be
a micro-optimization.

> Besides, I'm not sure what the start and end points are for the total
> scheduling latency of a vcpu that you mentioned, which includes many
> events. Is the parse time of the VPT not clear enough?

Measure the time it takes from kvm_vcpu_load() to the point where the
vcpu enters the guest. How much, in proportion, do these 1/2/10us
represent?

Also, a better(?) course of action would maybe be to consider whether we
should split the its_vpe_schedule() call into two distinct operations:
one that programs the VPE to be resident, and another that polls the
Dirty bit *much later* on the entry path, giving the GIC a chance to
work in parallel with the CPU on the entry path.

If your HW is as quick as you say it is, it would pretty much guarantee
a read of GICR_VPENDBASER with Dirty already clear, without waiting.
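
To make the idea concrete, here is a rough userspace toy model of the
split. It is only a sketch, not kernel code: fake_rd,
vpe_make_resident(), vpe_wait_clean() and schedule_split() are
hypothetical names, the bit positions are assumptions for the model, and
a "tick" is an abstract time unit standing in for both VPT parsing time
and CPU work on the entry path.

```c
#include <stdint.h>

/* Assumed bit positions, used only by this toy model. */
#define FAKE_VPENDBASER_VALID (1ULL << 63)
#define FAKE_VPENDBASER_DIRTY (1ULL << 60)

/* Hypothetical stand-in for a redistributor parsing the VPT. */
struct fake_rd {
	uint64_t vpendbaser;
	unsigned int parse_ticks_left; /* ticks until Dirty is cleared */
};

/* Advance the modelled hardware by 'ticks' units of time. */
static void fake_rd_tick(struct fake_rd *rd, unsigned int ticks)
{
	if (!(rd->vpendbaser & FAKE_VPENDBASER_DIRTY))
		return;
	if (ticks >= rd->parse_ticks_left) {
		rd->parse_ticks_left = 0;
		rd->vpendbaser &= ~FAKE_VPENDBASER_DIRTY;
	} else {
		rd->parse_ticks_left -= ticks;
	}
}

/* Operation 1: make the VPE resident and return without waiting. */
static void vpe_make_resident(struct fake_rd *rd, unsigned int parse_ticks)
{
	rd->vpendbaser = FAKE_VPENDBASER_VALID;
	rd->parse_ticks_left = parse_ticks;
	if (parse_ticks > 0)
		rd->vpendbaser |= FAKE_VPENDBASER_DIRTY;
}

/* Operation 2: poll Dirty; returns how many polls saw it still set. */
static unsigned int vpe_wait_clean(struct fake_rd *rd)
{
	unsigned int spins = 0;

	while (rd->vpendbaser & FAKE_VPENDBASER_DIRTY) {
		spins++;
		fake_rd_tick(rd, 1); /* one tick elapses per poll */
	}
	return spins;
}

/*
 * Split schedule: program residency, let 'entry_work_ticks' of other
 * entry-path work overlap with the parsing, and only then poll.
 */
static unsigned int schedule_split(unsigned int parse_ticks,
				   unsigned int entry_work_ticks)
{
	struct fake_rd rd;

	vpe_make_resident(&rd, parse_ticks);
	fake_rd_tick(&rd, entry_work_ticks);
	return vpe_wait_clean(&rd);
}
```

In this model, schedule_split(2, 0) behaves like polling right after
programming the register and spins twice, while schedule_split(2, 10)
never spins because the parse finished during the overlapped work.
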
M.
--
Jazz is not dead. It just smells funny...