Re: [PATCH v2] kernel/resource.c: invalidate parent when freed resource has childs

From: Wei Yang
Date: Thu Aug 15 2019 - 09:03:35 EST


On Thu, Aug 15, 2019 at 08:18:06AM +0000, Schmid, Carsten wrote:
>>>When a resource is freed and has children, the childrens are
>>
>> s/childrens/children/
>>
>oh, missed that. Too many children ... ;-)
>
>>>+ __release_child_resources(tmp, warn);
>>
>> This function will release all the children.
>>
>> Is this what Linus suggested?
>>
>> From his code snippet, I only see the siblings' parent being set to NULL.
>> Maybe I am missing some point?
>>
>At the point we are here, there should be no children, and no children of
>children at all ...
>So they are all more or less lost in the wild.
>That was why I didn't copy Linus' code 1:1 but reused an already existing
>function that does a similar thing.
>It's anyway worth thinking about this.
>
>What I have in mind here (example):
>Parent:      iomem map 0x1000..0x1FFF
>  Child1:    iomem map 0x1000..0x17FF
>    Child11: iomem map 0x1000..0x13FF
>    Child12: iomem map 0x1400..0x17FF
>  Child2:    iomem map 0x1800..0x1FFF
>    Child21: iomem map 0x1800..0x1BFF
>    Child22: iomem map 0x1C00..0x1FFF
>
>When releasing the parent, how can children 11, 12, 21 and 22 still be valid?
>They don't know that their grandfather died ...
>Looking at __release_child_resources, I found that all children are
>invalidated/released exactly in the way Linus did it for the parent's child list.
>Doesn't it make sense to do the same for all of them?
>
>Please comment.
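
Just to make sure we are looking at the same thing, here is a rough sketch of
the two behaviours being discussed (illustrative only, not the actual kernel
code; the helper names below are made up):

/*
 * Variant A: only orphan the direct children, which is how I read the
 * quoted snippet from Linus -- grandchildren keep their stale pointers.
 */
static void orphan_children(struct resource *parent)
{
	struct resource *p;

	for (p = parent->child; p; p = p->sibling)
		p->parent = NULL;
	parent->child = NULL;
}

/*
 * Variant B: walk the whole subtree and detach every descendant,
 * roughly what a recursive release ends up doing.
 */
static void drop_subtree(struct resource *parent)
{
	struct resource *p = parent->child;

	while (p) {
		struct resource *next = p->sibling;

		drop_subtree(p);	/* grandchildren first */
		p->parent = NULL;
		p->sibling = NULL;
		p->child = NULL;
		p = next;
	}
	parent->child = NULL;
}

With variant A, Child11/12/21/22 in your example would still point at their
released parents; with variant B the whole subtree is detached. Both would
have to run under resource_lock, which is left out here for brevity.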
>
>> >+static void check_children(struct resource *parent)
>> >+{
>> >+	if (parent->child) {
>> >+		/* warn and release all children */
>> >+		WARN_ONCE(1, "%s: %s has child %s, release all children\n",
>> >+			  __func__, parent->name, parent->child->name);
>> >+		write_lock(&resource_lock);
>>
>> In the previous version, the lock was taken before parent->child was checked.
>>
>> Not sure why you changed the order?
>>
>To hold the lock as short as possible.
>But yes, you are right, this could lead to problems if the children are
>released in a parallel thread on a multicore system ...
>I'll change that so the whole resource access is covered by the lock.
>Not a big thing ...
>
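
For the record, a minimal sketch of what covering both the check and the
release with the lock could look like (illustrative only, not the actual
next version of the patch):

static void check_children(struct resource *parent)
{
	write_lock(&resource_lock);
	if (parent->child) {
		/*
		 * Test and cleanup happen under the same resource_lock hold,
		 * so a concurrent release on another CPU cannot change the
		 * child list in between.
		 */
		WARN_ONCE(1, "%s: %s has child %s, release all children\n",
			  __func__, parent->name, parent->child->name);
		__release_child_resources(parent, true);
	}
	write_unlock(&resource_lock);
}

Whether the core code should do this cleanup at all is a separate question,
though.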

My gut feeling is that this is a problem caused by a malfunctioning driver, e.g.
xhci-hcd. We do our best to protect the core kernel from it instead of doing
the cleanup for it.

So my suggestion is to look into why xhci-hcd behaves like this and fix that.

>Best regards
>Carsten

--
Wei Yang
Help you, Help me