Re: [PATCH v3] PM / core: conditionally skip system pm in device/driver model

From: Guan-Yu Lin
Date: Thu Feb 29 2024 - 04:08:23 EST


On Wed, Feb 28, 2024 at 1:57 AM Florian Fainelli <f.fainelli@xxxxxxxxx> wrote:
>
> On 2/27/24 00:56, Guan-Yu Lin wrote:
> > On Tue, Feb 27, 2024 at 2:40 AM Florian Fainelli <f.fainelli@xxxxxxxxx> wrote:
> >>
> >> On 2/26/24 02:28, Guan-Yu Lin wrote:
> >>> On Sat, Feb 24, 2024 at 2:20 AM Florian Fainelli <f.fainelli@xxxxxxxxx> wrote:
> >>>>
> >>>> On 2/23/24 06:38, Guan-Yu Lin wrote:
> >>>>> In systems with a main processor and a co-processor, asynchronous
> >>>>> controller management can lead to conflicts. One example is the main
> >>>>> processor attempting to suspend a device while the co-processor is
> >>>>> actively using it. To address this, we introduce a new sysfs entry
> >>>>> called "conditional_skip". This entry allows the system to selectively
> >>>>> skip certain device power management state transitions. To use this
> >>>>> feature, set the value in "conditional_skip" to indicate the type of
> >>>>> state transition you want to avoid. Please review /Documentation/ABI/
> >>>>> testing/sysfs-devices-power for more detailed information.
> >>>>
> >>>> This looks like a poor way of dealing with a lack of adequate resource
> >>>> tracking from Linux on behalf of the co-processor(s) and I really do not
> >>>> understand how someone is supposed to use that in a way that works.
> >>>>
> >>>> Cannot you use a HW maintained spinlock between your host processor and
> >>>> the co-processor such that they can each claim exclusive access to the
> >>>> hardware and you can busy-wait until one or the other is done using the
> >>>> device? How is your partitioning between host processor owned blocks and
> >>>> co-processor(s) owned blocks? Is it static or is it dynamic?
> >>>> --
> >>>> Florian
> >>>>
> >>>
> >>> This patch enables devices to selectively participate in system power
> >>> transitions. This is crucial when multiple processors, managed by
> >>> different operating system kernels, share the same controller. One
> >>> processor shouldn't enforce the same power transition procedures on
> >>> the controller – another processor might be using it at that moment.
> >>> While a spinlock is necessary for synchronizing controller access, we
> >>> still need to add the flexibility to dynamically customize power
> >>> transition behavior for each device. And that's what this patch is
> >>> trying to do.
> >>> In our use case, the host processor and co-processor are managed by
> >>> separate operating system kernels. This arrangement is static.
> >>
> >> OK, so now the question is whether the peripheral is entirely visible to
> >> Linux, or is it entirely owned by the co-processor, or is there a
> >> combination of both and the usage of the said device driver is dynamic
> >> between Linux and your co-processor?
> >>
> >> A sysfs entry does not seem like the appropriate way to describe which
> >> states need to be skipped and which ones can remain under control of
> >> Linux, you would have to use your firmware's description for that (ACPI,
> >> Device Tree, etc.) such that you have a more comprehensive solution that
> >> can span a bigger scope.
> >> --
> >> Florian
> >>
> >
> > We anticipate that control of the peripheral (e.g., controller) will
> > be shared between operating system kernels. Each kernel will need its
> > own driver for peripheral communication. To accommodate different
> > tasks, the operating system managing the peripheral can change
> > dynamically at runtime.
>
> OK, that seems like this ought to be resolved at various layers other
> than just user-space, starting possibly with an
> overarching/reconciliation layer between the various operating systems?
>

We achieve cooperation between the operating system kernels by
assigning each interrupt to the kernel that handles it, and only one
kernel can write commands to the peripheral.

> >
> > We dynamically select the operating system kernel controlling the
> > target peripheral based on the task at hand, which looks more like a
> > software behavior rather than hardware behavior to me. I agree that we
> > might need a firmware description for "whether another operating
> > system exists for this peripheral", but we also need to store the
> > information about "whether another operating system is actively using
> > this peripheral". To me, the latter one looks more like a sysfs entry
> > rather than a firmware description as it's not determined statically.
>
> I can understand why moving this sort of decisions to user-space might
> sound appealing, but it also seems like if the peripheral is going to be
> "stolen" away from Linux, then maybe Linux should not be managing it at
> all, e.g.: unbind the device from its driver, and then rebind it when
> Linux needs to use it. You would have to write your drivers such that
> they can skip the peripheral's initialization if you need to preserve
> state from the previous agent after an ownership change for instance?
>
> I do not think you are painting a full picture of your use case,
> hopefully not intentionally but at first glance it sounds like you need
> a combination of kernel-level changes to your drivers, and possibly more.
>
> Seems like more details need to be provided about the overall intended
> use cases such that people can guide you with a fuller picture of the
> use cases.
> --
> Florian
>

Let me describe our real-world use case. The peripheral (controller)
can issue multiple interrupts, each of which is handled by one of two
operating system kernels (Linux and a non-Linux kernel). In addition,
only one kernel can issue commands to the
peripheral. Although we have successfully distributed control of this
peripheral between the kernels, Linux's system power management still
applies power transition rules to the entire peripheral without
awareness of the other kernel's activity. In other words, the Linux
kernel is responsible for only part of the peripheral's functionality,
but its power management decisions affect the entire peripheral. This
can interfere with the non-Linux kernel's operations.

We want to introduce a mechanism that lets the Linux kernel decide
whether to apply a system power transition to the peripheral based on
whether the other operating system kernel is actively using it. To
achieve this, this patch adds a sysfs attribute that provides the
Linux kernel with that information.
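
As a rough illustration of the intended decision (a user-space sketch,
not the patch code), the check could look like the following. The enum
values, the struct, and the helper name are hypothetical and chosen
only for this example; the actual values accepted by
"conditional_skip" are the ones described in the patch's ABI
documentation.

```c
#include <stdbool.h>

/*
 * Hypothetical transition types, one bit each. These names and values
 * are assumptions for illustration, not the patch's ABI.
 */
enum pm_transition {
	PM_TRANS_SUSPEND   = 1u << 0,
	PM_TRANS_HIBERNATE = 1u << 1,
};

/* Hypothetical per-device state mirroring the sysfs attribute. */
struct shared_dev {
	/* Mask of transitions to skip while the other kernel is active. */
	unsigned int conditional_skip;
};

/*
 * Return true if Linux should run its system PM callbacks for this
 * device on transition t, false if the transition should be skipped.
 */
static bool should_run_system_pm(const struct shared_dev *dev,
				 enum pm_transition t)
{
	return (dev->conditional_skip & t) == 0;
}
```

With the suspend bit set in the mask, a system suspend would leave the
peripheral alone while hibernation would still be handled as usual.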