Re: [PATCH v2 05/13] KVM: arm64: Detect (via ACPI) and initialize HACDBSIRQ

From: Oliver Upton

Date: Tue Jun 30 2026 - 12:10:36 EST


On Tue, Jun 30, 2026 at 03:50:17PM +0100, Leonardo Bras wrote:
> On Mon, Jun 29, 2026 at 10:22:12AM -0700, Oliver Upton wrote:
> > If we need to initialize the IRQ I'd really like to see device tree
> > bindings for HACDBSIRQ as well. Pretty much any system us plebs can get
> > our hands on is gonna be DT anyway.
>
> Agree. I started out with ACPI because that's what the main target is, as
> dirty-logging is focused on Live Migration, which is usually more
> appreciated in the server space, which generally uses ACPI.
>
> I spoke to some people, and I have not heard of anyone releasing a product
> based on DT that would implement this yet, so I postponed the DT
> enablement.

Nested virt is always a good example. In some distant future KVM could
expose FEAT_HACDBS to the L1 hypervisor, and the VMM may be using DT
instead of ACPI (like kvmtool).

> >
> > > +static irqreturn_t hacdbsirq_handler(int irq, void *pcpu)
> > > +{
> > > + u64 cons = read_sysreg_s(SYS_HACDBSCONS_EL2);
> > > + unsigned long err = FIELD_GET(HACDBSCONS_EL2_ERR_REASON, cons);
> > > +
> > > + switch (err) {
> > > + case HACDBSCONS_EL2_ERR_REASON_NOF:
> > > + this_cpu_write(hacdbs_pcp.status, HACDBS_IDLE);
> > > + break;
> > > + case HACDBSCONS_EL2_ERR_REASON_IPAHACF:
> > > + /* When size not a power of two >= 4k, exit with reserved TTLW */
> > > + int index = FIELD_GET(HACDBSCONS_EL2_INDEX, cons);
> > > +
> > > + if (index >= this_cpu_read(hacdbs_pcp.size)) {
> > > + this_cpu_write(hacdbs_pcp.status, HACDBS_IDLE);
> > > + break;
> > > + }
> > > + fallthrough;
> > > + case HACDBSCONS_EL2_ERR_REASON_STRUCTF:
> > > + case HACDBSCONS_EL2_ERR_REASON_IPAF:
> > > + this_cpu_write(hacdbs_pcp.status, HACDBS_ERROR);
> > > + break;
> > > + }
> > > +
> > > + return IRQ_HANDLED;
> > > +}
> >
> > I have a pretty extreme distaste for creating a state machine between
> > the callsite and the IRQ handler. The callsite should poll HACDBS for
> > completion. The thread has nothing better to do anyway.
>
> Well, there is an argument that it could just wait and save some energy,
> but I agree it is not relevant in the server space.

I wouldn't suggest polling in a tight loop :) I'd say use something like
__mdelay() to get the core into a low-power state w/o using a naked WFI.
In fact, that already uses WFxT under the hood.
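
To make the shape concrete, here is a rough user-space sketch of the
bounded poll-with-delay pattern, nothing more. Every name in it is made
up for illustration: fake_hacdbs_busy() stands in for reading the
hardware state, fake_delay() stands in for whatever delay primitive the
kernel side would really use, and the result codes are not from the patch
or the architecture. It has not been anywhere near real hardware.

```c
#include <stdbool.h>

/* Hypothetical result codes, invented for this sketch. */
enum hacdbs_result { HACDBS_DONE, HACDBS_FAILED, HACDBS_TIMEOUT };

/* Stand-in for the hardware: reports "busy" for a fixed number of polls. */
struct fake_hacdbs {
	unsigned int polls_until_done;
	bool fail;	/* pretend the hardware flagged an error on completion */
};

static bool fake_hacdbs_busy(struct fake_hacdbs *hw)
{
	if (hw->polls_until_done == 0)
		return false;
	hw->polls_until_done--;
	return true;
}

/* Stand-in for a real delay; here it only counts how often we waited. */
static unsigned int delay_calls;

static void fake_delay(void)
{
	delay_calls++;
}

/*
 * The caller waits for completion itself: check, delay, check again,
 * and give up after max_polls checks. No state is shared with an IRQ
 * handler.
 */
static enum hacdbs_result hacdbs_wait(struct fake_hacdbs *hw,
				      unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (!fake_hacdbs_busy(hw))
			return hw->fail ? HACDBS_FAILED : HACDBS_DONE;
		fake_delay();
	}
	return HACDBS_TIMEOUT;
}
```

The point is only that the completion check and the error decode live in
one place, at the callsite, with an upper bound on how long it waits.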

> The main reason I did this is
> because I am planning on later doing an improved version of this that would
> clean the dirty-bit *while* running the guest, and the IRQ is needed to
> exit the guest so we can notify userspace that the cleaning is done. So I
> laid the HACDBSIRQ infra here so we don't have both polling and IRQ options
> happening.
>
> That idea would require us to add a new API (a return value for 'cleaned'),
> and also a new flag for the clean ioctl. We also need the VMM to
> implement that, but then we get proper CPU usage of cleaning time.
>
> I wanted to start with a backwards compatible version, and do the above
> idea once I get my hands on hardware that implements HACDBS, so I can
> properly measure how much performance we get with the above strategy.
>
> What do you think?

Yeah, I'd want to see some extremely compelling performance numbers for
this approach before considering it, alongside the necessary VMM patches
to actually activate it.

Seems likely to me that the VMM will want the background thread that
calls the clean ioctl back ASAP, so you'll need to work out how to cope
with idle vCPUs in that case.

Even still, with this hypothetical approach I'd expect KVM to inspect
the HACDBS state on every exit. The IRQ is just a convenient kick back
out to the main KVM_RUN loop.
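
Roughly the following, as a user-space toy rather than anything resembling
the real exit path. All of the names (fake_vcpu, handle_exit(), the exit
reasons) are invented for the sketch, and hw_done stands in for whatever
KVM would actually read from the hardware.

```c
#include <stdbool.h>

/* Hypothetical exit reasons, invented for this sketch. */
enum exit_reason { EXIT_MMIO, EXIT_IRQ, EXIT_WFI };

struct fake_vcpu {
	bool clean_in_flight;	/* a background clean was started */
	bool hw_done;		/* stand-in for the hardware's completion state */
	bool notify_userspace;	/* set once completion has been observed */
};

/* Runs on every exit; it does not care why the guest exited. */
static void check_clean_done(struct fake_vcpu *v)
{
	if (v->clean_in_flight && v->hw_done) {
		v->clean_in_flight = false;
		v->notify_userspace = true;
	}
}

/*
 * Returns true when the run loop should go back to userspace. The exit
 * reason is deliberately ignored: an IRQ-induced exit records nothing by
 * itself, it merely gets us here sooner.
 */
static bool handle_exit(struct fake_vcpu *v, enum exit_reason why)
{
	(void)why;
	check_clean_done(v);
	return v->notify_userspace;
}
```

In this toy, an EXIT_IRQ taken before the hardware is done changes
nothing, and an unrelated EXIT_MMIO taken after it is done still picks up
the completion.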

Thanks,
Oliver