Re: [PATCH 009/190] Revert "media: s5p-mfc: Fix a reference count leak"

From: Mauro Carvalho Chehab
Date: Fri Apr 23 2021 - 06:23:16 EST


Em Fri, 23 Apr 2021 11:42:55 +0200
Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx> escreveu:

> (adding c/c to Rafael)
>
> Em Fri, 23 Apr 2021 10:41:32 +0200 (CEST)
> Julia Lawall <julia.lawall@xxxxxxxx> escreveu:
>
> > On Fri, 23 Apr 2021, Krzysztof Kozlowski wrote:
> >
> > > On 23/04/2021 10:10, Hans Verkuil wrote:
> > > > On 23/04/2021 10:07, Mauro Carvalho Chehab wrote:
> > > >> Em Fri, 23 Apr 2021 09:10:32 +0200
> > > >> Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> escreveu:
> > > >>
> > > >>> On Fri, Apr 23, 2021 at 09:04:27AM +0200, Krzysztof Kozlowski wrote:
> > > >>>> On 21/04/2021 14:58, Greg Kroah-Hartman wrote:
> > > >>>>> This reverts commit 78741ce98c2e36188e2343434406b0e0bc50b0e7.
> > > >>>>>
> > > >>>>> Commits from @umn.edu addresses have been found to be submitted in "bad
> > > >>>>> faith" to try to test the kernel community's ability to review "known
> > > >>>>> malicious" changes. The result of these submissions can be found in a
> > > >>>>> paper published at the 42nd IEEE Symposium on Security and Privacy
> > > >>>>> entitled, "Open Source Insecurity: Stealthily Introducing
> > > >>>>> Vulnerabilities via Hypocrite Commits" written by Qiushi Wu (University
> > > >>>>> of Minnesota) and Kangjie Lu (University of Minnesota).
> > > >>>>>
> > > >>>>> Because of this, all submissions from this group must be reverted from
> > > >>>>> the kernel tree and will need to be re-reviewed again to determine if
> > > >>>>> they actually are a valid fix. Until that work is complete, remove this
> > > >>>>> change to ensure that no problems are being introduced into the
> > > >>>>> codebase.
> > > >>>>>
> > > >>>>> Cc: Qiushi Wu <wu000273@xxxxxxx>
> > > >>>>> Cc: Hans Verkuil <hverkuil-cisco@xxxxxxxxx>
> > > >>>>> Cc: Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx>
> > > >>>>> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > >>>>> ---
> > > >>>>> drivers/media/platform/s5p-mfc/s5p_mfc_pm.c | 4 +---
> > > >>>>> 1 file changed, 1 insertion(+), 3 deletions(-)
> > > >>>>>
> > > >>>>
> > > >>>> This looks like a good commit but should be done now in a different way
> > > >>>> - using pm_runtime_resume_and_get(). Therefore I am fine with revert
> > > >>>> and I can submit later better fix.
> > > >>>
> > > >>> Great, thanks for letting me know, I can have someone work on the
> > > >>> "better fix" at the same time.
> > > >>
> > > >> IMO, it is better to keep the fix. I mean, there's no reason to
> > > >> revert a fix that it is known to be good.
> > > >>
> > > >> The "better fix" patch can be produced anytime. A simple coccinelle
> > > >> ruleset can replace patterns like:
> > > >>
> > > >> ret = pm_runtime_get_sync(pm->device);
> > > >> if (ret < 0) {
> > > >> pm_runtime_put_noidle(pm->device);
> > > >> return ret;
> > > >> }
> > > >>
> > > >> and the broken pattern:
> > > >>
> > > >> ret = pm_runtime_get_sync(pm->device);
> > > >> if (ret < 0)
> > > >> return ret;
> > > >>
> > > >> to:
> > > >>
> > > >> ret = pm_runtime_resume_and_get(pm->device);
> > > >> if (ret < 0)
> > > >> return ret;
> > > >
> > > > That's my preference as well.
> > >
> > > It won't be that easy because sometimes the error handling is via goto
> > > (like in other patches here) but anyway I don't mind keeping the
> > > original commits.
> >
> > I tried the following semantic patch:
> >
> > @@
> > expression ret,e;
> > @@
> >
> > - ret = pm_runtime_get_sync(e);
> > + ret = pm_resume_and_get(e);
> > if (ret < 0) {
> > ...
> > ?- pm_runtime_put_noidle(e);
> > ...
> > return ret;
> > }
> >
> > It has the following features:
> >
> > * The ? means that if pm_runtime_put_noidle is absent, the transformation
> > will happen anyway.
> >
> > * The ... before the return means that the matching will jump over a goto.
> >
> > It makes a lot of changes (in a kernel I had handy from March).
>
> I would expect lots of changes, as the pm_runtime_resume_and_get() was only
> recently introduced on this changeset:
>
> commit dd8088d5a8969dc2b42f71d7bc01c25c61a78066
> Author: Zhang Qilong <zhangqilong3@xxxxxxxxxx>
> Date: Tue Nov 10 17:29:32 2020 +0800
>
> PM: runtime: Add pm_runtime_resume_and_get to deal with usage counter
>
> In many case, we need to check return value of pm_runtime_get_sync, but
> it brings a trouble to the usage counter processing. Many callers forget
> to decrease the usage counter when it failed, which could resulted in
> reference leak. It has been discussed a lot[0][1]. So we add a function
> to deal with the usage counter for better coding.
>
> [0]https://lkml.org/lkml/2020/6/14/88
> [1]https://patchwork.ozlabs.org/project/linux-tegra/list/?series=178139
> Signed-off-by: Zhang Qilong <zhangqilong3@xxxxxxxxxx>
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>
>
> > This is a
> > complicated API, however, and I don't know if there are any other issues
> > to take into account, especially in the case where the call to
> > pm_runtime_put_noidle is not present.
>
> I double-checked the code, despite its name, pm_runtime_put_noidle() just
> changes the refcount. See, the relevant code is here:
>
> static inline void pm_runtime_put_noidle(struct device *dev)
> {
> atomic_add_unless(&dev->power.usage_count, -1, 0);
> }
>
> static inline int pm_runtime_get_sync(struct device *dev)
> {
> return __pm_runtime_resume(dev, RPM_GET_PUT);
> }
>
> int __pm_runtime_resume(struct device *dev, int rpmflags)
> {
> unsigned long flags;
> int retval;
>
> might_sleep_if(!(rpmflags & RPM_ASYNC) && !dev->power.irq_safe &&
> dev->power.runtime_status != RPM_ACTIVE);
>
> if (rpmflags & RPM_GET_PUT)
> atomic_inc(&dev->power.usage_count);
>
> spin_lock_irqsave(&dev->power.lock, flags);
> retval = rpm_resume(dev, rpmflags);
> spin_unlock_irqrestore(&dev->power.lock, flags);
>
> return retval;
> }
>
> Not being an expert at the PM runtime API, at least on my eyes,
> replacing pm_runtime_get_sync() by pm_runtime_resume_and_get()
> seems to be the right thing to do, but Rafael should know more.

The cocci recipe made some wrong error check logic on some drivers,
like, for instance: drivers/media/i2c/mt9m001.c.

On such drivers, the logic is like:

ret = pm_runtime_resume_get_sync(&client->dev);
if (ret < 0)
goto put_unlock;
ret = mt9m001_apply_selection(sd);
if (ret)
goto put_unlock;


...
put_unlock:
pm_runtime_put(&client->dev);
mutex_unlock(&mt9m001->mutex);

On those cases where there are gotos to a logic with either
pm_runtime_put() or pm_runtime_put_no_idle(), the coccinelle
receipt should probably ignore it.

Thanks,
Mauro