Re: [PATCH v5 2/6] powercap/drivers/dtpm: Add hierarchy creation

From: Ulf Hansson
Date: Wed Jan 12 2022 - 07:00:51 EST


On Tue, 11 Jan 2022 at 18:52, Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> wrote:
>
> On 11/01/2022 09:28, Ulf Hansson wrote:
> > On Mon, 10 Jan 2022 at 16:55, Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> wrote:
> >>
> >> On 07/01/2022 16:54, Ulf Hansson wrote:
> >>> [...]
> >>>
> >>>>>> +static int dtpm_for_each_child(const struct dtpm_node *hierarchy,
> >>>>>> +			       const struct dtpm_node *it, struct dtpm *parent)
> >>>>>> +{
> >>>>>> +	struct dtpm *dtpm;
> >>>>>> +	int i, ret;
> >>>>>> +
> >>>>>> +	for (i = 0; hierarchy[i].name; i++) {
> >>>>>> +
> >>>>>> +		if (hierarchy[i].parent != it)
> >>>>>> +			continue;
> >>>>>> +
> >>>>>> +		dtpm = dtpm_node_callback[hierarchy[i].type](&hierarchy[i], parent);
> >>>>>> +		if (!dtpm || IS_ERR(dtpm))
> >>>>>> +			continue;
> >>>>>> +
> >>>>>> +		ret = dtpm_for_each_child(hierarchy, &hierarchy[i], dtpm);
> >>>>>
> >>>>> Why do you need to recursively call dtpm_for_each_child() here?
> >>>>>
> >>>>> Is there a restriction on how the dtpm core code manages adding
> >>>>> children/parents?
> >>>>
> >>>> [ ... ]
> >>>>
> >>>> The recursive call is needed given the structure of the tree in an array
> >>>> in order to connect with the parent.
> >>>
> >>> Right, I believe I understand what you are trying to do here, but I am
> >>> not sure if this is the best approach to do this. Maybe it is.
> >>>
> >>> The problem is that we are also allocating memory for a dtpm and we
> >>> call dtpm_register() on it in this execution path - and this memory
> >>> doesn't get freed up nor unregistered, if any of the later recursive
> >>> calls to dtpm_for_each_child() fails.
> >>>
> >>> The point is, it looks like it can get rather messy with the recursive
> >>> calls to cope with the error path. Maybe it's easier to store the
> >>> allocated dtpms in a list somewhere and use this to also find a
> >>> reference of a parent?
> >>
> >> I think it is better to continue the construction with other nodes even
> >> if some of them fail to create; it should be a non-critical issue. As an
> >> analogy, if one thermal zone fails to create, the other thermal zones
> >> are not removed.
> >
> > Well, what if it fails because its "consumer part" is waiting for some
> > resource to become available?
> >
> > Maybe the devfreq driver/subsystem isn't available yet and causes
> > -EPROBE_DEFER, for example. Perhaps this isn't the way the dtpm
> > registration works currently, but surely it's worth considering
> > going forward, no?
>
> It should be solved by the fact that the DTPM description is a module
> and loaded after the system booted. The module loading ordering is
> solved by userspace.

Ideally, yes. However, drivers/subsystems in the kernel should respect
-EPROBE_DEFER. It's good practice to do that.

>
> I agree, we could improve that but it is way too complex to be addressed
> in a single series and should be part of a specific change IMO.

It's not my call to make, but I don't agree, sorry.

In my opinion, plain error handling to avoid leaking memory isn't
something that should be addressed later. At least if the problems are
already spotted during review.
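
To make the point concrete, here is a rough userspace sketch of the kind
of error handling I have in mind. It is not the dtpm code: struct
node_desc, struct node, node_create(), node_destroy(), for_each_child(),
create_hierarchy() and the live_nodes counter are all made-up names for
illustration, and -EAGAIN stands in for -EPROBE_DEFER, which userspace
headers don't define. The idea is that every node is linked to its
parent as soon as it is allocated, so a failure anywhere in the
recursion can be unwound from the root and the error code returned to
the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the hierarchy description and the nodes. */
struct node_desc {
	const char *name;		/* NULL terminates the array */
	const struct node_desc *parent;	/* NULL for top-level entries */
	int fail_with;			/* simulated setup error, 0 = success */
};

struct node {
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
};

static int live_nodes;	/* allocation counter, only to make leaks visible */

static int node_create(const struct node_desc *desc, struct node *parent,
		       struct node **out)
{
	struct node *n;

	if (desc->fail_with)
		return desc->fail_with;	/* -EAGAIN models -EPROBE_DEFER */

	n = calloc(1, sizeof(*n));
	if (!n)
		return -ENOMEM;

	/* Link to the parent right away so that unwinding can find it. */
	n->parent = parent;
	if (parent) {
		n->next_sibling = parent->first_child;
		parent->first_child = n;
	}
	live_nodes++;
	*out = n;
	return 0;
}

/* Free a node and its whole subtree, unlinking it from its parent. */
static void node_destroy(struct node *n)
{
	struct node **pp;

	while (n->first_child)
		node_destroy(n->first_child);

	if (n->parent) {
		for (pp = &n->parent->first_child; *pp != n;
		     pp = &(*pp)->next_sibling)
			;
		*pp = n->next_sibling;
	}
	assert(live_nodes > 0);
	live_nodes--;
	free(n);
}

static int for_each_child(const struct node_desc *hierarchy,
			  const struct node_desc *it, struct node *parent)
{
	struct node *n;
	int i, ret;

	for (i = 0; hierarchy[i].name; i++) {
		if (hierarchy[i].parent != it)
			continue;

		ret = node_create(&hierarchy[i], parent, &n);
		if (ret)
			return ret;	/* propagate instead of continuing */

		ret = for_each_child(hierarchy, &hierarchy[i], n);
		if (ret)
			return ret;	/* the top-level caller unwinds */
	}
	return 0;
}

static int create_hierarchy(const struct node_desc *hierarchy,
			    struct node **rootp)
{
	struct node *root = calloc(1, sizeof(*root));
	int ret;

	if (!root)
		return -ENOMEM;
	live_nodes++;

	ret = for_each_child(hierarchy, NULL, root);
	if (ret) {
		node_destroy(root);	/* frees everything created so far */
		return ret;
	}
	*rootp = root;
	return 0;
}
```

With something along these lines, the caller of create_hierarchy() sees
the error code, nothing stays allocated on failure, and a retry at a
later point is possible.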

>
> > In any case, papering over the error seems quite scary to me. I would
> > much prefer if we instead could propagate the error code correctly to
> > the caller of dtpm_create_hierarchy(), to allow it to retry if
> > necessary.
>
> It is really something we should be able to address later.
>

[...]

Kind regards
Uffe