RE: [RFC] Documentation: Add documentation for new performance_profile sysfs class
From: Limonciello, Mario
Date: Wed Oct 07 2020 - 14:41:44 EST
> On Wed, 2020-10-07 at 15:58 +0000, Limonciello, Mario wrote:
> >
> > > On Mon, 2020-10-05 at 12:58 +0000, Limonciello, Mario wrote:
> > > > > On modern systems CPU/GPU/... performance is often dynamically
> > > > > configurable
> > > > > in the form of e.g. variable clock-speeds and TPD. The
> > > > > performance
> > > > > is often
> > > > > automatically adjusted to the load by some automatic-mechanism
> > > > > (which may
> > > > > very well live outside the kernel).
> > > > >
> > > > > These auto performance-adjustment mechanisms often can be
> > > > > configured with
> > > > > one of several performance-profiles, with either a bias towards
> > > > > low-power
> > > > > consumption (and cool and quiet) or towards performance (and
> > > > > higher
> > > > > power
> > > > > consumption and thermals).
> > > > >
> > > > > Introduce a new performance_profile class/sysfs API which
> > > > > offers a
> > > > > generic
> > > > > API for selecting the performance-profile of these automatic-
> > > > > mechanisms.
> > > > >
> > > >
> > > > If introducing an API for this - let me ask the question, why
> > > > even let each
> > > > driver offer a class interface and userspace need to change
> > > > "each" driver's
> > > > performance setting?
> > > >
> > > > I would think that you could just offer something kernel-wide
> > > > like
> > > > /sys/power/performance-profile
> > > >
> > > > Userspace can read and write to a single file. All drivers can
> > > > get notified
> > > > on this sysfs file changing.
> > > >
> > > > The systems that react in firmware (such as the two that prompted
> > > > this discussion) can change at that time. It leaves the
> > > > possibility for a
> > > > more open kernel implementation that can do the same thing though
> > > > too by
> > > > directly modifying device registers instead of ACPI devices.
> > >
> > > The problem, as I've mentioned in previous discussions we had about
> > > this, is that, as you've seen in replies to this mail, this would
> > > suddenly be making the kernel apply policy.
> > >
> > > There's going to be pushback as soon as policy is enacted in the
> > > kernel, and you take away the different knobs for individual
> > > components
> > > (or you can control them centrally as well as individually). As
> > > much as
> > > I hate the quantity of knobs[1], I don't think that trying to
> > > reduce
> > > the number of knobs in the kernel is a good use of our time, and
> > > easier
> > > to enact, coordinated with design targets, in user-space.
> > >
> > > Unless you can think of a way to implement this kernel wide setting
> > > without adding one more exponent on the number of possibilities for
> > > the
> > > testing matrix, I'll +1 Hans' original API.
> > >
> > Actually I offered two proposals in my reply. So are you NAKing
> > both?
>
> No, this is only about the first portion of the email, which I quoted.
> And I'm not NAK'ing it, but I don't see how it can work without being
> antithetical to what kernel "users" expect, or what the folks consuming
> those interfaces (presumably us both) would expect to be able to test
> and maintain.
>
(Just so others are aware, Bastien and I had a previous discussion on this topic
that he alluded to here: https://gitlab.freedesktop.org/hadess/power-profiles-daemon/-/issues/1)
In general I agree that we shouldn't be offering 100's of knobs to change
things and protect users from themselves where possible.
Whether the decisions are made in the kernel or in userspace you still have a matrix once
you're letting someone change 2 different kernel devices that offer policy. I'd argue it's
actually worse if you let userspace change it though.
Let's go back to the my GPU and platform example and lets say both offer the new knob here
for both. Userspace software such as your PPD picks performance. Both the platform device
and GPU device get changed, hopefully no conflicts.
Then user decides no, I don't want my GPU in performance mode, I only want my platform.
So they change the knob for the GPU manually, and now you have a new config in your matrix.
However if you left it to a single kernel knob, both GPU and platform get moved together and
you don't have these extra configs in your matrix anymore.
The other point I mentioned, that platform might also do something to GPU via a sideband and
you race, you can solve it with kernel too by modifying the ordering the kernel handles it.
Userspace however, you give two knobs and now you have to worry about them getting it right
and supporting them doing them in the wrong order.
> > The other one suggested to use the same firmware attributes class
> > being
> > introduced by the new Dell driver (
> > https://patchwork.kernel.org/patch/11818343/)
> > since this is actually a knob to a specific firmware setting.
>
> This seemed to me like an implementation detail (eg. the same metadata
> is being exported, but in a different way), and I don't feel strongly
> about it either way.
OK thanks.