Re: [PATCH 38/78] media: i2c: mt9m001: use pm_runtime_resume_and_get()

From: Johan Hovold
Date: Wed Apr 28 2021 - 06:05:16 EST


On Wed, Apr 28, 2021 at 10:31:48AM +0200, Mauro Carvalho Chehab wrote:
> Em Tue, 27 Apr 2021 14:23:20 +0200
> Johan Hovold <johan@xxxxxxxxxx> escreveu:

> > pm_runtime_get_sync() has worked this way since it was merged 12 years
> > ago, and for someone who's used to this interface this is not such a big
> > deal as you seem to think. Sure, you need to remember to put the usage
> > counter on errors, but that's it (and the other side of that is that you
> > don't need to worry about error handling where it doesn't matter).
>
> Before we have those at PM subsystem, the media had its own way to
> set/disable power for their sub-devices. The PCI and USB drivers
> still use it, instead of pm_runtime, mostly due to historic reasons.
>
> So, basically, its usage at the media subsystem is restricted to
> drivers for embedded systems. The vast majority of drivers supporting
> PM runtime are the I2C camera drivers. The camera drivers can be used
> interchangeably. So, in practice, the same bridge driver can work
> with a lot of different camera models, depending on the hardware
> vendors' personal preferences and the desired max resolution.
>
> So, in theory, all such drivers should behave exactly the same
> with regards to PM.
>
> However, on most existing drivers, the pm_runtime was added a
> couple of years ago, and by people that are not too familiar
> with the PM subsystem.
>
> That probably explains why there were/are several places that
> do things like this[1]:
>
> 	ret = pm_runtime_get_sync(dev);
> 	if (ret < 0)
> 		return ret;
>
> without taking care of calling a pm_runtime_put*() function.
>
> [1] besides the 13 patches made by UCN addressing it on
> existing code, I discovered the same pattern on a
> couple of other drivers with current upstream code.
>
> That shows a pattern: several media developers are not familiar
> with the usage_count behavior for pm_runtime_get functions.
>
> So, doing this work is not only helping to make the PM support
> more uniform, but it is also helping to solve existing issues.

Sure, I don't doubt that there are issues with the current code too.

> > Your call, but converting functioning drivers where the authors knew
> > what they were doing just because you're not used to the API and risk
> > introducing new bugs in the process isn't necessarily a good idea.
>
> The problem is that the above assumption is not necessarily true:
> based on the number of drivers that weren't decrementing usage_count
> after a failed pm_runtime_get_sync(), several driver authors were not
> familiar enough with the PM runtime behavior by the time the drivers
> were written or converted to use the PM runtime, instead of the
> media .s_power()/.s_stream() callbacks.

That may very well be the case. My point is just that this work needs to
be done carefully and by people familiar with the code (and runtime pm)
or you risk introducing new issues.

I really don't want the bot-warning-suppression crew to start with this
for example.

> > Especially since the pm_runtime_get_sync() will continue to be used
> > elsewhere, and possibly even in media in cases where you don't need to
> > check for errors (e.g. remove()).
>
> Talking about the remove(), I'm not sure if just ignoring errors
> there would do the right thing. I mean, if pm_runtime_get_sync()
> fails, probably any attempts to disable clocks and other things
> that depend on PM runtime will also (silently) fail.
>
> This may leave the device in an unknown PM state and keep clock lines
> enabled after its removal.

Right, a resume failure is a pretty big issue and it's not really clear
how to even handle that generally. But at remove() time you don't
have much choice but to go on and release resources anyway.

So unless you actually implement some error handling too, using
pm_runtime_get_sync() without checking for errors is still preferred
over pm_runtime_resume_and_get(). That is

	pm_runtime_get_sync(dev);
	/* cleanup */
	pm_runtime_disable(dev);
	pm_runtime_put_noidle(dev);

is better than:

	ret = pm_runtime_resume_and_get(dev);
	/* cleanup */
	pm_runtime_disable(dev);
	if (ret == 0)
		pm_runtime_put_noidle(dev);

unless you also start doing something with ret.
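
To make the usage-count bookkeeping concrete, here is a rough userspace
mock of the two remove() variants. It is only a sketch: struct mock_dev
and all mock_*() and remove_with_*() names are made up for illustration,
the "resume" is a flag rather than a real callback, and the mock ignores
details such as pm_runtime_get_sync() returning 1 when the device is
already active. The one property it tries to model is that the
get_sync-style helper keeps the count raised on failure, while the
resume_and_get-style helper drops it again.

```c
/* Userspace mock of the runtime PM usage counter; this is NOT kernel code. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct mock_dev {
	int usage_count;	/* stands in for the device's PM usage counter */
	bool resume_fails;	/* force the mocked resume to fail */
	bool pm_enabled;	/* cleared by the mocked pm_runtime_disable() */
};

/* Models pm_runtime_get_sync(): the count is raised even if resume fails. */
static int mock_get_sync(struct mock_dev *dev)
{
	dev->usage_count++;
	return dev->resume_fails ? -EIO : 0;
}

/* Models pm_runtime_put_noidle(): drop the count without idling the device. */
static void mock_put_noidle(struct mock_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* Models pm_runtime_resume_and_get(): on failure the count is dropped again. */
static int mock_resume_and_get(struct mock_dev *dev)
{
	int ret = mock_get_sync(dev);

	if (ret < 0) {
		mock_put_noidle(dev);
		return ret;
	}
	return 0;
}

/* remove() variant 1: unconditional get and put, return value ignored. */
static int remove_with_get_sync(struct mock_dev *dev)
{
	mock_get_sync(dev);
	/* cleanup */
	dev->pm_enabled = false;
	mock_put_noidle(dev);
	return dev->usage_count;
}

/* remove() variant 2: the put must be skipped if the get already undid itself. */
static int remove_with_resume_and_get(struct mock_dev *dev)
{
	int ret = mock_resume_and_get(dev);

	/* cleanup */
	dev->pm_enabled = false;
	if (ret == 0)
		mock_put_noidle(dev);
	return dev->usage_count;
}
```

In this mock both variants leave the count balanced whether or not the
resume fails; the second one just needs the extra conditional to get
there. Calling mock_get_sync() on a failing device and returning early,
as in the pattern Mauro quoted, leaves the count at 1.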

Johan