Re: [PATCH 1/2] device property: do not leak child nodes when using NULL/error pointers

From: Andy Shevchenko
Date: Fri Nov 29 2024 - 09:50:28 EST


On Thu, Nov 28, 2024 at 03:04:50PM -0800, Dmitry Torokhov wrote:
> On Thu, Nov 28, 2024 at 03:13:16PM +0200, Andy Shevchenko wrote:
> > On Wed, Nov 27, 2024 at 09:39:34PM -0800, Dmitry Torokhov wrote:
> > > The documentation to various API calls that locate children for a given
> > > fwnode (such as fwnode_get_next_available_child_node() or
> > > device_get_next_child_node()) states that the reference to the node
> > > passed in "child" argument is dropped unconditionally, however the
> > > change that added checks for the main node to be NULL or error pointer
> > > broke this promise.
> >
> > This commit message doesn't explain a use case. Hence it might be just
> > a documentation issue, please elaborate.
>
> I do not have a specific use case in mind, however the implementation
> behavior does not match the stated one, and so it makes sense to get it
> fixed. Otherwise callers would have to add checks to conditionally drop
> the reference to "child" argument in certain cases, which will
> complicate caller's code.

Perhaps this should go somewhere in the cover letter or the commit message?
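
For illustration, the contract under discussion can be modeled in user space
roughly as follows. This is only a sketch under stated assumptions: struct
toy_node and the toy_*() helpers are hypothetical stand-ins invented for this
example, not kernel APIs, and the reference counting is reduced to a plain
integer. It shows that when the "child" reference is dropped on every path,
including the invalid-parent one, the caller's loop needs no conditional
cleanup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct fwnode_handle; not kernel code. */
struct toy_node {
	int refcount;
	struct toy_node *next_sibling;
	struct toy_node *first_child;
};

/* Take a reference; NULL is tolerated, as with fwnode_handle_get(). */
static struct toy_node *toy_get(struct toy_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Drop a reference; NULL is tolerated, as with fwnode_handle_put(). */
static void toy_put(struct toy_node *n)
{
	if (!n)
		return;
	assert(n->refcount > 0);
	n->refcount--;
}

/*
 * Models the documented contract: the reference held on "child" is dropped
 * on every path, including the one where "parent" is invalid.
 */
static struct toy_node *toy_next_child(struct toy_node *parent,
				       struct toy_node *child)
{
	struct toy_node *next;

	if (!parent) {
		toy_put(child);
		return NULL;
	}

	next = child ? child->next_sibling : parent->first_child;
	toy_get(next);
	toy_put(child);
	return next;
}

/* Walks all children and returns their count; no cleanup is needed after. */
static int toy_count_children(struct toy_node *parent)
{
	struct toy_node *child = NULL;
	int count = 0;

	while ((child = toy_next_child(parent, child)))
		count++;
	return count;
}
```

Without the toy_put() in the invalid-parent branch, a caller handing in a
referenced child together with an invalid parent would be left holding a
reference it has to drop itself.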

> > > Add missing fwnode_handle_put() calls to restore the documented
> > > behavior.

...

> > > {
> > > + if (IS_ERR_OR_NULL(fwnode) ||
> >
> > Unneeded check as fwnode_has_op() has it already.
>
> Yes, it has, but that is neither obvious nor a documented behavior of
> fwnode_has_op().

Would you like to document that then?

> It also has different semantics: it checks whether a fwnode
> implements a given operation, not whether fwnode is valid. That check is
> incidental in fwnode_has_op().

I kinda disagree on this. An invalid fwnode cannot have any operations,
so the check is implied and will always be like that.

> They all are macros so compiler should collapse duplicate checks, but if
> you feel really strongly about it I can drop IS_ERR_OR_NULL() check.

Yes, please drop it. Instead, we want fwnode_has_op() to be documented with
its main purpose and its guaranteed side effect (the latter removes the need
for the duplication that I pointed out).

> > > + !fwnode_has_op(fwnode, get_next_child_node)) {
> > > + fwnode_handle_put(child);
> > > + return NULL;
> > > + }

...

> > > @@ struct fwnode_handle *device_get_next_child_node(const struct device *dev,
> > > const struct fwnode_handle *fwnode = dev_fwnode(dev);
> > > struct fwnode_handle *next;
> >
> > > - if (IS_ERR_OR_NULL(fwnode))
> > > + if (IS_ERR_OR_NULL(fwnode)) {
> > > + fwnode_handle_put(child);
> > > return NULL;
> > > + }
> >
> > > /* Try to find a child in primary fwnode */
> > > next = fwnode_get_next_child_node(fwnode, child);
> >
> > > So, why not just move the original check (w/o dropping the reference) here?
> > Wouldn't it have the same effect w/o explicit call to the fwnode_handle_put()?
>
> Because if you rely on check in fwnode_get_next_child_node() you would
> not know if it returned NULL because there are no more children or
> because the node is invalid. In the latter case you can't dereference
> fwnode->secondary.

Yes, so, how does it contradict my proposal?

--
With Best Regards,
Andy Shevchenko