Re: ...

From: Amit Shah
Date: Wed Jun 01 2022 - 05:44:38 EST


On Wed, 2022-06-01 at 08:52 +0200, Peter Zijlstra wrote:
> On Tue, May 31, 2022 at 02:52:04PM +0000, Durrant, Paul wrote:
> > >
> > >
> > > On Tue, May 31, 2022 at 02:02:36PM +0000, Jack Allister wrote:
> > > > The reasoning behind this is that you may want to run a guest at a
> > > > lower CPU frequency for the purposes of trying to match performance
> > > > parity between a host of an older CPU type to a newer faster one.
> > >
> > > That's quite ludicrous. Also, then it should be the host enforcing the
> > > cpufreq, not the guest.
> >
> > I'll bite... What's ludicrous about wanting to run a guest at a lower
> > CPU freq to minimize observable change in whatever workload it is
> > running?
>
> *why* would you want to do that? Everybody wants their stuff done
> faster.

We're running out of older hardware on which VMs have been started, and
these have to be moved to newer hardware.

We want the customer experience to stay as close to the current
situation as possible (i.e. no surprises), as this is just a live-
migration event for these instances.

Live migration events happen today as well, within the same hardware
and hypervisor cluster. But this hw deprecation thing is going to be
new -- meaning customers and workloads aren't used to having hw
characteristics change as part of LM events.

> If this is some hare-brained money scheme; must not give them if they
> didn't pay up then I really don't care.

Many workloads that are still tied to the older generation instances we
offer are there for a reason. EC2's newer instance generations have a
better price and performance than the older ones; yet folks use the
older ones. We don't want to guess as to why that is. We just want
these workloads to continue running w/o changes or w/o customers having
to even think about these things, while running on supported hardware.

So as infrastructure providers, we're doing everything possible behind-
the-scenes to ensure there's as little disruption to existing workloads
as possible.

> On top of that, you can't hide uarch differences with cpufreq capping.

Yes, this move (old hw -> new hw) isn't about hiding the change from
the instances. It's rather about matching the capabilities of the older
hw as closely as possible.

Some software will adapt to these changes; some software won't. We're
aiming to be ready for both scenarios as far as software allows us.

> Also, it is probably more power efficient to let it run faster and idle
> more, so you're not being environmental either.

Agreed; I can chat about that quite a bit, but that doesn't apply to
this context.

Amit