Re: [PATCH] PM / clock_ops: Fix clock error check in __pm_clk_add()

From: Rafael J. Wysocki
Date: Wed May 13 2015 - 18:20:25 EST


On Tuesday, May 12, 2015 05:32:29 PM Dmitry Torokhov wrote:
> On Wed, May 13, 2015 at 02:22:50AM +0200, Rafael J. Wysocki wrote:
> > On Tuesday, May 12, 2015 11:07:33 AM Dmitry Torokhov wrote:
> > > On Tue, May 12, 2015 at 08:59:03PM +0300, Grygorii.Strashko@xxxxxxxxxx wrote:
> > > > On 05/12/2015 07:42 PM, Dmitry Torokhov wrote:
> > > > > On Tue, May 12, 2015 at 04:55:39PM +0300, Grygorii.Strashko@xxxxxxxxxx wrote:
> > > > >> On 05/09/2015 12:05 AM, Dmitry Torokhov wrote:
> > > > >>> On Fri, May 08, 2015 at 10:59:04PM +0200, Geert Uytterhoeven wrote:
> > > > >>>> On Fri, May 8, 2015 at 7:19 PM, Dmitry Torokhov
> > > > >>>> <dmitry.torokhov@xxxxxxxxx> wrote:
> > > > >>>>> On Fri, May 08, 2015 at 10:47:43AM +0200, Geert Uytterhoeven wrote:
> > > > >>>>>> In the final iteration of commit 245bd6f6af8a62a2 ("PM / clock_ops: Add
> > > > >>>>>> pm_clk_add_clk()"), a refcount increment was added by Grygorii Strashko.
> > > > >>>>>> However, the accompanying IS_ERR() check operates on the wrong clock
> > > > >>>>>> pointer, which is always zero at this point, i.e. not an error.
> > > > >>>>>> This may lead to a NULL pointer dereference later, when __clk_get()
> > > > >>>>>> tries to dereference an error pointer.
> > > > >>>>>>
> > > > >>>>>> Check the passed clock pointer instead to fix this.
> > > > >>>>>
> > > > >>>>> Frankly I would remove the check altogether. Why do we only check for
> > > > >>>>> IS_ERR and not NULL or otherwise validate the pointer? The clk is passed
> > > > >>>>
> > > > >>>> __clk_get() does the NULL check.
> > > > >>>
> > > > >>> No, not really. It _handles_ clk being NULL and returns "everything is
> > > > >>> fine". In any case it is __clk_get's decision what to do.
> > > > >>>
> > > > >>> I dislike gratuitous checks of arguments passed in. Instead of relying
> > > > >>> on APIs refusing garbage we better not pass garbage to these APIs in the
> > > > >>> first place. So I'd change it to trust that we are given a usable
> > > > >>> pointer and simply do:
> > > > >>>
> > > > >>> 	if (!__clk_get(clk)) {
> > > > >>> 		kfree(ce);
> > > > >>> 		return -ENOENT;
> > > > >>> 	}
> > > > >>
> > > > >> Not sure this is right thing to do, because this API initially
> > > > >> was intended to be used as below [1]:
> > > > >> clk = of_clk_get(dev->of_node, i);
> > > > >> ret = pm_clk_add_clk(dev, clk);
> > > > >> clk_put(clk);
> > > > >>
> > > > >> and of_clk_get may return ERR_PTR().
> > > > >
> > > > > Jeez, that sequence was not meant to be taken literally, it does miss
> > > > > error handling completely. If you notice the majority of users of this
> > > > > API do something like below:
> > > > >
> > > > > 	i = 0;
> > > > > 	while ((clk = of_clk_get(dev->of_node, i++)) && !IS_ERR(clk)) {
> > > > > 		dev_dbg(dev, "adding clock '%s' to list of PM clocks\n",
> > > > > 			__clk_get_name(clk));
> > > > > 		error = pm_clk_add_clk(dev, clk);
> > > > > 		clk_put(clk);
> > > > > 		if (error) {
> > > > > 			dev_err(dev, "pm_clk_add_clk failed %d\n", error);
> > > > > 			pm_clk_destroy(dev);
> > > > > 			return error;
> > > > > 		}
> > > > > 	}
> > > > >
> > > > > i.e. it already validates clk pointer before passing it on since it
> > > > > needs to know when to stop iterating.
> > > >
> > > > np. It's just my opinion - if you agree that code will just crash
> > > > in case of passing invalid @clk argument (in worst case:)
> > > >
> > > > int __clk_get(struct clk *clk)
> > > > {
> > > > struct clk_core *core = !clk ? NULL : clk->core;
> > > > ^^^ here
> > >
> > > Yes, it will crash if you pass invalid pointer here, be it
> > > ERR_PTR-encoded value, or, for example, 0x1, or maybe (void
> > > *)random_32(). The latter will probably not crash right away, but cause
> > > some random damage that will manifest later.
> >
> > Oh well. Shouldn't we actually do:
> >
> > int __clk_get(struct clk *clk)
> > {
> > struct clk_core *core = IS_ERR_OR_NULL(clk) ? NULL : clk->core;
> >
> > and remove the check from __pm_clk_add() at the same time?
> >
> > Knowingly crashing on an error encoded as a pointer is kind of disgusting to me
> > and the difference between that and a random invalid pointer is that people who
> > pass error values encoded as pointers up the stack usually expect them to be
> > handled cleanly.
>
> I think the operative word here is "up". Returning ERR_PTR-encoded
> pointer is fine, checking it is fine as well, blindly passing it *down*
> into a random API is not fine and we should not try to accommodate this.

You're basically saying "Passing an error-encoding pointer down to an API is
not valid" which I agree with, but I don't agree that it's OK to crash the
kernel when that happens. It's never OK to crash the kernel when we can
easily avoid that, because it may lead to user data loss.

However, you seem to be arguing against fixing up things *silently* which may
hide serious bugs. That's a good point, so what about adding a WARN_ON_ONCE()
around the IS_ERR() check in Geert's patch?
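
To make that concrete, here is a rough userspace sketch of the shape I have in
mind. It is not the actual __pm_clk_add() code: IS_ERR(), ERR_PTR() and
WARN_ON_ONCE() below are simplified stand-ins for the kernel helpers (the macro
relies on the GCC/Clang statement-expression extension), and struct clk,
mock_clk_get(), pm_clk_add_sketch() and warn_count are made-up names used only
for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error codes live in the last MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical counter so the demo can observe how often a warning fired. */
static int warn_count;

/*
 * Simplified stand-in for WARN_ON_ONCE(): evaluates to the condition and
 * prints a message the first time the condition is true at this call site.
 */
#define WARN_ON_ONCE(cond) ({						\
	static bool warned_;						\
	bool ret_ = !!(cond);						\
	if (ret_ && !warned_) {						\
		warned_ = true;						\
		warn_count++;						\
		fprintf(stderr, "WARNING: %s:%d: %s\n",			\
			__FILE__, __LINE__, #cond);			\
	}								\
	ret_;								\
})

/* Made-up clock type; the real struct clk looks nothing like this. */
struct clk {
	int refcount;
};

/* Mock of __clk_get(): tolerates NULL and reports success. */
static int mock_clk_get(struct clk *clk)
{
	if (clk)
		clk->refcount++;
	return 1;
}

/*
 * Sketch of the check under discussion: an ERR_PTR-encoded clk is
 * rejected cleanly instead of being dereferenced, but loudly (once),
 * so that the caller's bug is not hidden.
 */
static int pm_clk_add_sketch(struct clk *clk)
{
	if (WARN_ON_ONCE(IS_ERR(clk)) || !mock_clk_get(clk))
		return -ENOENT;

	return 0;
}
```

With this, a valid clock gets its reference taken as before, while a caller
that passes something like ERR_PTR(-ENOENT) gets -ENOENT back and one warning
in the log, no matter how many times it repeats the mistake.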
