One thing I was thinking here was that I could create a flag for the
kvm_irqfd() function, something like "KVM_IRQFD_MODE_CLEAR". This
flag, when specified at creation time, would cause the event to execute a
clear operation instead of a set when triggered. That way, the default
mode is an edge-triggered set, and the non-default mode is to trigger a
clear. Level-triggered interrupts could therefore create two irqfds, one
for raising, the other for clearing.
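
To make the two-irqfd idea concrete, here is a rough userspace sketch, not
kernel code. It only models the line level with two plain eventfds.
KVM_IRQFD_MODE_CLEAR is the hypothetical flag proposed above, and every
sim_* name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Hypothetical flag from the proposal; not an existing KVM definition. */
#define KVM_IRQFD_MODE_CLEAR (1u << 0)

/* Userspace stand-in for one interrupt line; models only its level. */
struct sim_irq_line {
    int level;     /* 1 = asserted, 0 = deasserted */
    int raise_fd;  /* default mode: a signal sets the line */
    int clear_fd;  /* KVM_IRQFD_MODE_CLEAR: a signal clears the line */
};

/* Create both eventfds; returns 0 on success, -1 on failure. */
static int sim_irq_line_init(struct sim_irq_line *line)
{
    assert(line != NULL);
    line->level = 0;
    line->raise_fd = eventfd(0, EFD_NONBLOCK);
    if (line->raise_fd < 0)
        return -1;
    line->clear_fd = eventfd(0, EFD_NONBLOCK);
    if (line->clear_fd < 0) {
        close(line->raise_fd);
        return -1;
    }
    return 0;
}

/* Signal an eventfd by adding 1 to its counter. */
static int sim_signal(int fd)
{
    uint64_t one = 1;
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Consume a pending signal on fd, if any, and apply a set or a clear
 * depending on the flags the fd was "created" with.
 * Returns 1 if a signal was consumed, 0 if nothing was pending.
 */
static int sim_irqfd_service(struct sim_irq_line *line, int fd,
                             unsigned int flags)
{
    uint64_t count;

    assert(line != NULL);
    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return 0;
    line->level = (flags & KVM_IRQFD_MODE_CLEAR) ? 0 : 1;
    return 1;
}
```

One thing the sketch makes visible: the set and the clear travel on separate
fds, so their relative order is only as good as the order in which the
consumer services them.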

An alternative is to abandon the use of eventfd and make the irqfd a
first-class anon-fd. The parameters passed to the write/signal()
function could then indicate the desired level. The disadvantage is
that it would not be compatible with eventfd, so we would need to
decide if the tradeoff is worth it.
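
As a rough illustration of the anon-fd variant, the sketch below uses a
plain pipe as a stand-in for the dedicated fd, with the write() payload
carrying the level. The sim_irq_msg layout and the sim_* helpers are
assumptions for illustration only, not a proposed ABI. An eventfd could
not be used this way, since its write() only adds an 8-byte value to a
counter.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical payload for a write() on a dedicated irqfd. */
struct sim_irq_msg {
    uint32_t level;  /* 1 = assert the line, 0 = deassert it */
};

/* Producer side: request a level by writing one message to the fd. */
static int sim_irq_write_level(int fd, uint32_t level)
{
    struct sim_irq_msg msg = { .level = level ? 1u : 0u };

    return write(fd, &msg, sizeof(msg)) == (ssize_t)sizeof(msg) ? 0 : -1;
}

/* Consumer side: read one message and report the requested level. */
static int sim_irq_read_level(int fd, int *level)
{
    struct sim_irq_msg msg;

    assert(level != NULL);
    if (read(fd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
        return -1;
    *level = msg.level ? 1 : 0;
    return 0;
}
```

With a single fd, raise and clear requests arrive in the order they were
written, which is something the two-eventfd scheme does not give us for
free.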

OTOH, I suspect level-triggered interrupts will be primarily in the
legacy domain, so perhaps we do not need to worry about it too much.
Therefore, another option is that we *could* simply put a stake in the
ground and say that legacy/level cannot use irqfd.