Re: [PATCH v2] KVM: x86: Rate-limit global clock updates on vCPU load

From: Jaroslav Pulchart

Date: Wed May 06 2026 - 12:07:47 EST


>
> On Wed, May 06, 2026, Thorsten Leemhuis wrote:
> > On 5/6/26 14:55, Sean Christopherson wrote:
> > > On Wed, May 06, 2026, Thorsten Leemhuis wrote:
> > >> On 4/9/26 21:21, Sean Christopherson wrote:
> > >>> On Thu, Apr 09, 2026, Lei Chen wrote:
> > >>>> commit 446fcce2a52b ("Revert "x86: kvm: rate-limit global clock updates"")
> > >>>> dropped the rate limiting for KVM_REQ_GLOBAL_CLOCK_UPDATE.
> > >>>>
> > >>>> As a result, kvm_arch_vcpu_load() can queue global clock update requests
> > >>>> every time a vCPU is scheduled when the master clock is disabled or when
> > >>>> the vCPU is loaded for the first time.
> > >>>>
> > >>>> Restore the throttling with a per-VM ratelimit state and gate
> > >>>> KVM_REQ_GLOBAL_CLOCK_UPDATE through __ratelimit(), so frequent vCPU
> > >>>> scheduling does not generate a steady stream of redundant clock update
> > >>>> requests.
> > >>>>
> > >>>> Fixes: 446fcce2a52b ("Revert "x86: kvm: rate-limit global clock updates"")
> > >>>> Signed-off-by: Lei Chen <lei.chen@xxxxxxxxxx>
> > >>>> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@xxxxxxxxxxxx>
> > >>>> Closes: https://lore.kernel.org/all/CAK8fFZ5gY8_Mw2A=iZVFNVKQNrXQzVsn-HTd+Me9K6ZfmdgA+Q@xxxxxxxxxxxxxx/
> > >>
> > >> Was this performance regression ever addressed?
> > > Nope, not yet.
> > >
> > >> Looks like this fell through the cracks, but it's easy to miss something.
> > >
> > > It's in my list of patches to apply (probably for 7.2?). I didn't want to squeeze
> > > it into the initial 7.1 pull request for a variety of reasons.
> >
> > Hmmm. CCing Linus so he can speak up if he wants to about the following:
> >
> > Given that this is a fix for a performance regression[1] I'd say it's
> > not as urgent as a "something stopped working" case -- so I guess it's
> > something where the "[fix] 'within a week', preferably before the next
> > rc" approach Linus recently mentioned does not need to be applied strictly.
> >
> > But Jaroslav OTOH reported it more than 7 weeks ago already and back
> > then called it something that "severely impacts KVM hosts running many
> > Firecracker microVMs."[1];
>
> For a setup that is likely broken. On modern hardware, the path in question
> should never actually be hit. I do want to resolve the bug since older hardware
> and funky setups do rely on the old behavior, but it's not pants-on-fire urgent.
>
> More importantly, the original reporter(s) hasn't responded to any of our questions,
> or to the proposed fix. I'm not going to rush in a fix if I don't actually *know*
> it's going to fix the original problem.

Hi Sean, Thorsten,

Sorry for the missing response from my side. This thread unfortunately
ended up in the trash due to my mail filters and I completely
missed it. I don't have the full context loaded back in yet,
but I'll re-read the thread and follow up properly once I do.

For additional context, we are currently running the latest 6.19/7.0.y
kernels with a revert of the commits causing the reported regression,
and the hardware is an AMD EPYC 9454P 48-Core Processor.

Jaroslav

>
> > and a potential fix has existed for 4 weeks already. Due to that, 7.2 feels a bit
> > too far away for me, as that is still ~15 weeks away. But maybe that's just
> > me.
>
> The "user" is also a fairly sizeable company, not some random person that's trying
> to use KVM and is blocked. I highly doubt they are still actually running a buggy
> kernel. E.g. based on a "same workload after rollback" comment in the bug report,
> I assume they simply rolled back to the last good kernel (6.18).
>
> Who knows, maybe they also took our hints/suggestions about their setup being
> wonky and addressed whatever hiccup was sending them down the uncommon, already-
> slow path.
>
> All in all, AFAICT the only difference between sending this into 7.1 vs. 7.2 is
> that the reporter won't be able to upgrade their kernel (without patching) for an
> extra ~8 weeks.