Re: [PATCH v4 3/6] vfio: Increment the runtime PM usage count during IOCTL call

From: Alex Williamson
Date: Fri Jul 08 2022 - 12:49:33 EST

On Fri, 8 Jul 2022 15:13:16 +0530
Abhishek Sahu <abhsahu@xxxxxxxxxx> wrote:

> On 7/6/2022 9:10 PM, Alex Williamson wrote:
> > On Fri, 1 Jul 2022 16:38:11 +0530
> > Abhishek Sahu <abhsahu@xxxxxxxxxx> wrote:
> >
> >> The vfio-pci based driver will have runtime power management
> >> support where the user can put the device into the low power state
> >> and then PCI devices can go into the D3cold state. If the device is
> >> in the low power state and the user issues any IOCTL, then the
> >> device should be moved out of the low power state first. Once
> >> the IOCTL is serviced, then it can go into the low power state again.
> >> The runtime PM framework manages this with help of usage count.
> >>
> >> One option was to add the runtime PM related API's inside vfio-pci
> >> driver but some IOCTL (like VFIO_DEVICE_FEATURE) can follow a
> >> different path and more IOCTL can be added in the future. Also, the
> >> runtime PM will be added for vfio-pci based drivers variant currently,
> >> but the other VFIO based drivers can use the same in the
> >> future. So, this patch adds the runtime calls runtime-related API in
> >> the top-level IOCTL function itself.
> >>
> >> For the VFIO drivers which do not have runtime power management
> >> support currently, the runtime PM API's won't be invoked. Only for
> >> vfio-pci based drivers currently, the runtime PM API's will be invoked
> >> to increment and decrement the usage count.
> >
> > Variant drivers can easily opt-out of runtime pm support by performing
> > a gratuitous pm-get in their device-open function.
> >
> Do I need to add this line in the commit message?

Maybe I misinterpreted, but my initial read was that there was some
sort of opt-in, which there is by providing pm-runtime support in the
driver, which vfio-pci-core does for all vfio-pci variant drivers. But
there's also an opt-out, where a vfio-pci variant driver might not want
to support pm-runtime support and could accomplish that by bumping the
pm reference count on device-open such that the user cannot trigger a

> >> Taking this usage count incremented while servicing IOCTL will make
> >> sure that the user won't put the device into low power state when any
> >> other IOCTL is being serviced in parallel. Let's consider the
> >> following scenario:
> >>
> >> 1. Some other IOCTL is called.
> >> 2. The user has opened another device instance and called the power
> >> management IOCTL for the low power entry.
> >> 3. The power management IOCTL moves the device into the low power state.
> >> 4. The other IOCTL finishes.
> >>
> >> If we don't keep the usage count incremented then the device
> >> access will happen between step 3 and 4 while the device has already
> >> gone into the low power state.
> >>
> >> The runtime PM API's should not be invoked for
> >> VFIO_DEVICE_FEATURE_POWER_MANAGEMENT since this IOCTL itself performs
> >> the runtime power management entry and exit for the VFIO device.
> >
> > I think the one-shot interface I proposed in the previous patch avoids
> > the need for special handling for these feature ioctls. Thanks,
> >
> Okay. So, for low power exit case (means feature GET ioctl in the
> updated case) also, we will trigger eventfd. Correct?

If all ioctls are wrapped in pm-get/put, then the pm feature exit ioctl
would wakeup and signal the eventfd via the pm-get. I'm not sure if
it's worthwhile to try to surprise this eventfd. Do you foresee any
issues? Thanks,