get.org get.opt switch.opt
slots[7].len=32768 278379 66398 64024
slots[8].len=32768 181246 270 160
slots[7].len=32768 263961 64673 64494
slots[8].len=32768 181655 265 160
slots[7].len=32768 263736 64701 64610
slots[8].len=32768 182785 267 160
slots[7].len=32768 260925 65360 65042
slots[8].len=32768 182579 264 160
slots[7].len=32768 267823 65915 65682
slots[8].len=32768 186350 271 160
At a glance, we know our optimization improved significantly compared
to the original get dirty log ioctl. This is true for both get.opt and
switch.opt. This has a really big impact for the personal KVM users who
drive KVM in GUI mode on their usual PCs.
Next, we notice that switch.opt improved a hundred nano seconds or so for
these slots. Although this may sound a bit tiny improvement, we can feel
this as a difference of GUI's responses like mouse reactions.
100 ns... this is a bit on the low side (and if you can measure it
interactively you have much better reflexes than I).
To feel the difference, please try GUI on your PC with our patch series!
No doubt get.org -> get.opt is measurable, but get.opt->switch.opt is
problematic. Have you tried profiling to see where the time is spent
(well I can guess, clearing the write access from the sptes).
In usual workload, the number of dirty pages varies a lot for each
iteration
and we should gain really a lot for relatively clean cases.
Can you post such a test, for an idle large guest?